March 1987
Introduction 

Non-cooperative  game  theory  is  a  way  of  modelling  and  analyzing  situations 
in  which  each  player's  optimal  decisions  depend  on  his  beliefs  or  expectations 
about  the  play  of  his  opponents.   The  distinguishing  aspect  of  the  theory  is 
its  insistence  that  players  should  not  hold  arbitrary  beliefs  about  the  play 
of  their  opponents.   Instead,  each  player  should  try  to  predict  his  opponents' 
play,  using  his  knowledge  of  the  rules  of  the  game  and  the  assumption  that  his 
opponents  are  themselves  rational,  and  are  thus  trying  to  make  their  own 
predictions  and  to  maximize  their  own  payoffs.   Game-theoretic  methodology  has 
caused  deep  and  wide-reaching  changes  in  the  way  that  practitioners  think  about 
key  issues  in  oligopoly  theory,  much  as  the  idea  of  rational  expectations  has 
revolutionized  the  study  of  macroeconomics.   This  essay  tries  to  provide  an 
overview  of  those  aspects  of  the  theory  which  are  most  commonly  used  by 
industrial  organization  economists,  and  to  sketch  a  few  of  the  most  important 
or  illuminating  applications.   We  have  omitted  many  interesting  game-theoretic 
topics  which  have  not  yet  been  widely  applied. 

1. Games, Strategies, and Equilibria

This  section  introduces  the  two  formalisms  used  to  represent  noncooperative 
games,  and  then  discusses  what  we  might  mean  by  a  "reasonable  prediction"  for 
how  a  game  will  be  played.   This  will  lead  us  to  the  ideas  of  Nash  and 
subgame-perfect equilibria.


The  Extensive  and  Normal  Forms 

There  are  two  (almost)  equivalent  ways  of  formulating  a  game.  The  first 
is  the  extensive  form.    An  extensive  form  specifies:  (1)  the  order  of  play; 
(2)  the  choices  available  to  a  player  whenever  it  is  his  turn  to  move;  (3)  the 
information  a  player  has  at  each  of  these  turns;  (4)  the  payoffs  to  each  player 
as  a  function  of  the  moves  selected;  and  (5)  the  probability  distributions  for 
moves  by  "nature." 

The  extensive  form  is  depicted  by  a  "game  tree,"  such  as  those  in  Figures 
1  and  2.   Game  trees  are  the  multi-player  generalization  of  the  decision  trees 
used  in  decision  theory.   The  open  circle  is  the  first  or  initial  node.   The 
tree's  structure  says  which  nodes  follow  which,  and  the  numbers  at  each  node 
indicate  which  player  has  the  move  there.   (Part  of  what  is  meant  by  "tree"  is 
that this structure is an ordering--two distinct nodes cannot have the same
successor.   Thus  for  example  in  chess,  two  different  sequences  of  moves  which 
lead  to  the  same  position  on  the  board  are  assigned  different  nodes  in  the  tree. 
See  Kreps-Wilson  (1982b)  for  a  more  formal  discussion  of  this  and  other  details 
of  extensive  games.   See  also  the  classic  book  by  Luce  and  Raiffa  (1957)  which 
addresses most of the topics of this section.) The dotted line connecting two
of player two's nodes indicates that these two nodes are in the same "information
set," meaning that player two cannot tell which of the two actions has occurred
when  it  is  his  turn  to  move.   Players  must  know  when  it  is  their  turn  to  move, 
so  different  players'  information  sets  cannot  intersect,  and  players  must  know 
which  choices  are  feasible,  so  all  nodes  in  the  same  information  set  must  allow 
the same choices. We will restrict attention throughout to games of perfect
recall, in which each player always knows what he knew previously, including
his own previous actions. This implies an additional restriction on the
information sets.

[1] The following description is freely adapted from Kreps-Wilson (1982b).

Players  are  assumed  to  maximize  their  expected  utility,  given  their  beliefs 
about  the  actions  of  their  opponents  and  of  "Nature."  The  payoffs  corresponding 
to each sequence of actions are depicted at the terminal nodes or "outcomes"
of  the  tree;  (x,y)  at  a  terminal  node  means  that  player  one  gets  x  and  player 
two  gets  y.   The  different  initial  nodes  in  Figure  3  correspond  to  different 
moves  by  Nature,  i.e.  different  "states  of  the  world."   (Note  that  this  is  a 
one-player  game.)  There  is  no  loss  in  generality  in  placing  all  of  Nature's 
moves  at  the  start,  because  players  need  not  receive  information  about  these 
moves  until  later  on.  The  initial  assessment  p  is  a  probability  measure  over 
the  initial  nodes.   The  formal  models  we  will  discuss  will  always  assume  that 
this assessment, the terminal payoffs, and the entire structure of the tree are
"common knowledge," meaning that all players know it, and they know that their
opponents know it, and so on. This does not mean that all players are perfectly
informed, but rather that we have explicitly depicted all the differences in
information in our tree.[2] The extensive form will be taken to fully describe
the real situation--all possible moves and observations will be explicitly
specified. For example, if the "same game" is played three times, the "real
game" to be analyzed is the three-fold replication. The idealized situation
we have in mind is that, possibly after some "pre-play communication," players
are in separate rooms. They are informed of the course of play only by signals
corresponding to the information structure of the tree, and push various buttons
corresponding to the feasible actions at their various information sets. Once
play begins, players cannot explicitly communicate, except as provided by the
rules of the game. (In many situations, it is difficult to explicitly model
all the possible means of communication. This has spurred interest in shorthand
descriptions of the effects of communication. See our discussion of correlated
equilibrium.) A behavioral strategy for player i is a map that specifies, for
each of his information sets, a probability distribution over the actions that
are feasible at that set. A behavioral pure strategy specifies a single action
at each information set, as opposed to a probability mixture. (Later we will
discuss whether it might be reasonable for a player to randomize.) A given
specification of behavioral strategies and an initial assessment generates a
probability distribution over terminal nodes, and thus over payoffs, in the
obvious way.

[2] See Aumann (1976) and Brandenburger-Dekel (1985) for a formal treatment of
common knowledge, and also the Mertens-Zamir (1985) paper we mention in Section
3.
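
As a minimal sketch of how behavioral strategies induce a distribution over terminal nodes, the Python fragment below walks a small tree and multiplies the action probabilities along each path. The tree encoding (the `info_set`, `moves`, and `payoffs` keys), the function name `outcome_distribution`, the payoffs, and the mixing probabilities are all illustrative assumptions and are not taken from the figures. Nature's initial assessment is left out by assuming a single initial node.

```python
def outcome_distribution(node, strategies, prob=1.0, history=()):
    """Map each terminal history to its probability under behavioral strategies."""
    if "payoffs" in node:  # terminal node: the path reaching it has probability `prob`
        return {history: prob}
    dist = {}
    mix = strategies[node["info_set"]]  # distribution over the feasible actions
    for action, child in node["moves"].items():
        p = mix.get(action, 0.0)
        if p == 0.0:
            continue
        sub = outcome_distribution(child, strategies, prob * p, history + (action,))
        for h, q in sub.items():
            dist[h] = dist.get(h, 0.0) + q
    return dist


def leaf(x, y):
    """Terminal node where player one gets x and player two gets y."""
    return {"payoffs": (x, y)}


# Hypothetical simultaneous-move game: both of player two's nodes share one
# information set ("2"), so his mixture cannot depend on player one's action.
tree = {
    "info_set": "1",
    "moves": {
        "U": {"info_set": "2", "moves": {"L": leaf(2, 1), "R": leaf(0, 0)}},
        "D": {"info_set": "2", "moves": {"L": leaf(0, 0), "R": leaf(1, 2)}},
    },
}
strategies = {"1": {"U": 0.5, "D": 0.5}, "2": {"L": 0.25, "R": 0.75}}
dist = outcome_distribution(tree, strategies)
```

Keying the strategies by information set rather than by node is what enforces the informational restriction: a player's randomization is the same at every node he cannot distinguish.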

The  distinguishing  feature  of  game  theory  is  that  each  player's  beliefs 
about his opponents' actions are not arbitrarily specified. Instead, each player
is  assumed  to  believe  that  his  opponents  are  "rational",  and  to  use  that 
information  in  formulating  his  predictions  of  their  play.  Any  predictions  that 
are  inconsistent  with  this  presumed,  but  vaguely  specified,  rationality  are 
rejected. 

To  help  clarify  what  we  mean,  let  us  return  to  the  game  depicted  in  Figure 
1.   Is  there  a  reasonable  prediction  for  how  this  game  should/will  be  played? 
One  way  to  look  for  a  prediction  is  to  apply  backwards  induction.   If  player 
two's  information  set  is  reached,  and  the  payoffs  are  as  specified,  then  two 
should  play  L  .   Then  if  player  one  knows  that  player  two  knows  the  payoff, 
player  one  should  play  U  .   Is  this  a  good  prediction?  If  all  is  as  in  Figure 
1,  player  two  should  not  expect  player  one  to  play  D  .   What  should  two  tell 
himself  if  D  is  nevertheless  observed?   If  the  payoffs  are  guaranteed  to  be 
as specified, the only possible explanation is that player one made a
"mistake"--he meant to play U but somehow he failed to do so. This analysis
falls  apart  if  we  take  Figure  1  as  a  shorthand  description  for  a  game  which  is 
probably  as  depicted,  but  might  not  be,  so  that  playing  D  could  convey 
information  to  player  two.   We'll  say  more  about  this  in  Section  3.   The  key 
for  now  is  that  the  game  must  be  taken  as  an  exact  description  of  reality  for 
our  arguments  to  be  sound. 

In  Figure  1,  all  (both)  the  information  sets  are  singletons,  so  that  each 
player  knows  all  previous  actions  at  each  of  his  turns  to  move.  Games  like  this 
are  called  "games  of  perfect  information."  The  backwards  induction  argument 
used  above  is  called  "Kuhn's  algorithm"  (1953).   It  always  "works"  (yields  a 
conclusion)  in  finite  games  of  perfect  information,  and  yields  a  unique 
conclusion  as  long  as  no  two  terminal  nodes  give  any  player  exactly  the  same 
payoff.   Backwards  induction  will  not  yield  a  conclusion  in  games  of  imperfect 
information,  such  as  that  in  Figure  2.   Player  two's  optimal  choice  at  his 
information  set  depends  on  player  one's  previous  move,  which  player  two  has  not 
observed.   To  help  find  a  reasonable  prediction  for  this  game  we  introduce  the 
idea  of  the  normal  form. 
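
The backwards induction argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree shape (player one's move U ends the game, D hands the move to player two), the payoff numbers, and the names `backward_induction`, `player`, and `moves` are hypothetical and are not the contents of Figure 1. They are chosen so that two prefers L after D and one then prefers U, as in the discussion above.

```python
def backward_induction(node):
    """Return (payoffs, actions along the predicted path) for a finite
    perfect-information tree, resolving each decision node from the end."""
    if "payoffs" in node:
        return node["payoffs"], []
    mover = node["player"]  # index into the payoff tuple
    best = None
    for action, child in node["moves"].items():
        payoffs, path = backward_induction(child)
        # Ties are broken in favour of the first action listed, which is why
        # uniqueness needs the "no two equal payoffs" condition.
        if best is None or payoffs[mover] > best[0][mover]:
            best = (payoffs, [action] + path)
    return best


# Hypothetical payoffs (player one, player two); not the numbers of Figure 1.
tree = {
    "player": 0,
    "moves": {
        "U": {"payoffs": (2, 2)},
        "D": {
            "player": 1,
            "moves": {"L": {"payoffs": (1, 1)}, "R": {"payoffs": (0, 0)}},
        },
    },
}
payoffs, path = backward_induction(tree)
after_d = backward_induction(tree["moves"]["D"])
```

Here `after_d` records what player two would do at his node even though that node is not reached on the predicted path.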

The normal form representation of an extensive game condenses the details
of the tree structure into three elements: the set of players, $I$; each
player's strategy space, which is simply the set of his behavioral pure
strategies; and a payoff function mapping strategy selections for all of the
players to their payoffs. We will use $S_i$ to denote player i's strategy
space, $S$ to be the product of the $S_i$, and $\pi^i : S \to \mathbb{R}$ to be player i's payoff
function. A triple $(I, S, \pi)$ completely describes a normal form.

Normal  forms  for  two-player  games  are  often  depicted  as  matrices,  as  in 
Figure  4.   The  left-hand  matrix  is  the  normal  form  for  Figure  1,  while  the 
right-hand one corresponds to Figure 2. Note that different extensive forms
can have the same normal form. For example, Figure 2 is a "simultaneous-move"
game,  in  which  neither  player  observes  his  opponent's  action  before  choosing 
his  own.   We  could  represent  this  game  equally  well  with  an  extensive  form  in 
which  player  two  moved  first. 

A mixed strategy is a probability distribution over the normal-form
strategies. Payoffs to mixed strategies are simply the expected value of the
corresponding pure-strategy payoffs. We will denote mixed strategies by $\sigma$,
and the space of player i's mixed strategies by $\Sigma_i$. Although different mixed
strategies can give rise to the same behavioral strategies, Kuhn showed that the
two concepts are equivalent in games of perfect recall--any probability
distribution over outcomes that can be generated using one kind of randomization
can be duplicated by using the other.[3]

[3] Two strategies for a player which differ only at information sets which
follow a deviation by that player yield the same probability distribution over
outcomes for any strategy selections of the other players. Some authors define
the normal form as identifying such equivalent strategies.

In the normal form corresponding to Figure 1, choosing L gives player
two at least as high a payoff as choosing R regardless of player one's choice,
and gives strictly more if player one plays D. In such a case we say that
L is a (weakly) dominant strategy for player two. (Strict dominance means that
the strategy is strictly better for all choices by opponents.) It seems
reasonable that no player should expect an opponent to play a dominated
strategy, which means that one should expect that two will play L. This is
just rephrasing our backwards induction argument. The analog of rolling
backwards through the tree is the iterated elimination of dominated strategies:
making optimal choices at the last nodes is simple dominance, folding back one
step is first-order iterated dominance, and so on. (Actually iterated dominance
is a more general technique, as it can be applied to games of imperfect
information.)
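
The iterated elimination of dominated strategies can be sketched as follows for a two-player game in matrix form. The payoff table is hypothetical (it is not the left-hand matrix of Figure 4) and is chosen so that L weakly dominates R for player two, strictly so against D. The function names are likewise illustrative. One caveat: with weak dominance the surviving set can in general depend on the order of elimination, and this sketch simply alternates between the two players.

```python
def dominated(payoff, own, others, player):
    """Pure strategies in `own` that are weakly dominated by another pure
    strategy in `own`, given the opponent's remaining strategies `others`."""
    def u(mine, theirs):
        profile = (mine, theirs) if player == 0 else (theirs, mine)
        return payoff[profile][player]

    out = set()
    for s in own:
        for d in own:
            if d == s:
                continue
            never_worse = all(u(d, t) >= u(s, t) for t in others)
            sometimes_better = any(u(d, t) > u(s, t) for t in others)
            if never_worse and sometimes_better:
                out.add(s)
    return out


def iterated_elimination(payoff, rows, cols):
    """Alternately delete dominated columns and rows until nothing changes."""
    rows, cols = list(rows), list(cols)
    while True:
        bad_cols = dominated(payoff, cols, rows, player=1)
        cols = [c for c in cols if c not in bad_cols]
        bad_rows = dominated(payoff, rows, cols, player=0)
        rows = [r for r in rows if r not in bad_rows]
        if not bad_cols and not bad_rows:
            return rows, cols


# Hypothetical payoffs: (row strategy, column strategy) -> (player one, player two).
payoff = {
    ("U", "L"): (2, 2), ("U", "R"): (2, 2),
    ("D", "L"): (1, 1), ("D", "R"): (0, 0),
}
rows, cols = iterated_elimination(payoff, ["U", "D"], ["L", "R"])
```

In this example R is removed first, and only then does D become dominated for player one, which mirrors the two steps of the backwards induction argument.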

Nash  Equilibrium 

The  normal  form  for  Figure  2  does  not  have  dominant  strategies.   Here  to 
make  predictions  we  will  have  to  accept  a  weaker  notion  of  "reasonableness," 
that  embodied  in  the  concept  of  a  Nash  equilibrium.   A  Nash  equilibrium  is  a 
strategy selection such that no player can gain by playing differently, given
the strategies of his opponents. This condition is stated formally as

Definition: Strategy selection $s^*$ is a pure-strategy Nash equilibrium of the
game $(I, S, \pi)$ if for all players $i$ in $I$ and all $s_i$ in $S_i$,

(1)      $\pi^i(s^*) \ge \pi^i(s_i, s^*_{-i})$.

Here, the notation $(s_i, s^*_{-i})$ represents the strategy selection in which all
players but i play according to $s^*$, while i plays $s_i$. Note that $s^*$ can
be an equilibrium even if there is some player i who is indifferent between $s^*_i$
and an alternative, $s_i$. We view Nash equilibrium as a minimal requirement
that  a  proposed  solution  must  satisfy  to  be  "reasonable."  If  a  strategy 
selection  is  not  a  Nash  equilibrium,  then  all  players  know  that  some  player 
would  do  better  not  to  play  as  the  selection  specifies.   If  "reasonable"  is  to 
mean  anything,  it  should  rule  out  such  inconsistent  predictions.   Not  all  Nash 
equilibria  are  reasonable,  as  is  revealed  by  examining  the  extensive  and  normal 
forms of Figure 5. The backwards-induction equilibrium (D,L) is a Nash
equilibrium, but so is (U,R). We will soon discuss the idea of a "perfect
equilibrium," which is designed to formalize the idea that (U,R) is not
reasonable. The perfection notion and other refinements of Nash equilibrium
do  not  help  with  the  following  problem.  Consider  a  game  like  that  in  Figure 
6.   The  only  Nash  equilibrium  is  (U,L),  yet  is  this  a  reasonable  prediction? 
It  depends  on  whether  the  players  are  sure  that  the  payoffs  are  exactly  as  we've 
specified, and that their opponents are "rational." If player one plays U against
L, his payoff is 5, which is better than the 4.9 that he gets from D.
However, playing D guarantees that player one gets 4.9, while if the outcome is (U,R)
then he gets 0. Similarly, player two can guarantee 4.9 by playing R.
Thus if player one is not sure that player two prefers L to R,
then D could be attractive. And even if player one is sure of player two's
payoffs, if player one is not sure that player two knows player one's payoffs,
then player one might still fear that player two will play R. The point is that
the logic of Nash equilibrium relies on every player knowing that every player
knows that ... the payoffs are as specified. Technically, the payoffs should
be "common knowledge" (as should the Nash concept itself). The closer the
payoffs guaranteed by D and R come to the equilibrium payoffs, the more we
need to insist on the common knowledge. Ideally, equilibria should be subjected
to this sort of informal check or "sensitivity analysis."
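
Condition (1) is easy to check by brute force in a small matrix game. The sketch below is illustrative only: the payoff numbers are hypothetical (they are not those of Figure 5) and are picked so that, as in the discussion of that figure, both (D,L) and (U,R) pass the Nash test even though only (D,L) survives backwards induction. The function name `pure_nash` is an assumption.

```python
from itertools import product


def pure_nash(payoff, rows, cols):
    """All pure-strategy profiles at which neither player has a strictly
    profitable unilateral deviation (the weak inequality in condition (1))."""
    equilibria = []
    for r, c in product(rows, cols):
        row_ok = all(payoff[(r, c)][0] >= payoff[(r2, c)][0] for r2 in rows)
        col_ok = all(payoff[(r, c)][1] >= payoff[(r, c2)][1] for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical game: U ends the game with payoffs (1, 3); after D, player two
# chooses L for (2, 1) or R for (0, 0).
payoff = {
    ("U", "L"): (1, 3), ("U", "R"): (1, 3),
    ("D", "L"): (2, 1), ("D", "R"): (0, 0),
}
equilibria = pure_nash(payoff, ["U", "D"], ["L", "R"])
```

(U,R) passes the test only because player two is indifferent between L and R when his node is not reached, which is the kind of equilibrium the perfection refinement is meant to exclude.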

Returning  to  Figure  2,  the  game  there  has  two  pure  strategy  equilibria, 
(U,L)  and  (D,R).   If  there  is  a  reasonable  outcome  in  this  game,  both  players 
must  be  able  to  predict  it,  and  predict  that  their  opponents  will  predict  it, 
and  so  on.   If  players  cannot  so  coordinate  their  expectations,  there  is  no 
reason  to  expect  observed  play  to  correspond  to  either  equilibrium--for  example, 
we  might  see  the  outcome  (U,R).   Not  all  games  have  reasonable  solutions,  and 
on  the  data  given  so  far  this  could  be  one.  However,  Schelling's  (1960)  theory 
of  "focal  points"  suggests  that  in  some  "real  life"  situations  players  may  be 
able to coordinate on a particular equilibrium by using information that is
abstracted away in the standard game formulation. For example, the names of
the strategies may have some commonly-understood "focal" power. An example is
two players who are asked to name an exact time, with the promise of a reward
if their choices match. Here "12 noon" is focal, while "1:43" is not. The
payoffs may also help coordinate expectations. If both players did better with
(U,L) than (D,R), then (U,L) seems a natural outcome to expect one's opponent
to expect that.... Some authors (including us!) have argued that if there is
a  unique  Pareto  optimum  among  the  set  of  equilibria,  it  should  be  a  focal  point. 
While  this  intuition  seems  sound  for  two-player  games,  a  recent  example  of 
Bernheim,  Peleg  and  Whinston  shows  that  with  more  than  two  players  the  intuition 
is  suspect.  In  response,  they  have  introduced  the  concept  of 
"coalition-proofness",  which  we  discuss  at  the  end  of  this  section. 

The  idea  of  a  Nash  equilibrium  is  implicit  in  two  of  the  first  games  to 
have  been  formally  studied,  namely  the  Cournot  and  Bertrand  models  of  oligopoly. 
Let  us  emphasize  that  despite  the  common  practice  of  speaking  of  Cournot  and 
Bertrand  equilibrium,  the  models  are  best  thought  of  as  studying  the  Nash 
equilibria  of  two  different  simultaneous  move  games.   In  the  Cournot  model, 
firms  simultaneously  choose  quantities,  and  the  price  is  set  at  the 
market-clearing  level  by  a  fictitious  auctioneer.   In  the  Bertrand  model,  firms 
simultaneously  choose  prices,  and  then  must  produce  to  meet  demand  after  the 
price  choices  become  known.   In  each  model,  firms  choose  best  responses  to  the 
anticipated  play  of  their  opponents. 

For concreteness, we remind the reader of the Cournot model of a duopoly
producing a homogeneous good. Firm 1 and Firm 2 simultaneously choose their
respective output levels, $q_1$ and $q_2$, from feasible sets $F_i$. They sell their
output at the market-clearing price $p(Q)$, where $Q = q_1 + q_2$. Firm i's
cost of production is $c_i(q_i)$, and firm i's total profit is then
$\pi^i(q_1, q_2) = q_i p(Q) - c_i(q_i)$. The feasible sets $F_i$ and the payoff functions $\pi^i$
determine the normal form of the game; the reader should check that he/she knows
how to construct an equivalent extensive form. The "Cournot reaction functions"
$R^1(q_2)$ and $R^2(q_1)$ specify each firm's optimal output for each fixed output
level of its opponent. If the $\pi^i$ are differentiable and strictly concave,
and the appropriate boundary conditions are satisfied, we can solve for these
reaction functions using the first-order conditions. The intersections of the
two reaction functions (if any exist) are the Nash equilibria of the Cournot
game: neither player can gain by a change in output, given the output level
of its opponent.
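
To make the reaction-function construction concrete, here is a small numerical sketch under an assumed specification that is not in the text: linear inverse demand p(Q) = a - bQ and a common constant marginal cost c, with the illustrative values a = 10, b = 1, c = 1. Under these assumptions the first-order condition gives the reaction function (a - c - b q_j)/(2b), and the two reaction curves cross at q_1 = q_2 = (a - c)/(3b). The function names are hypothetical.

```python
def reaction(q_other, a=10.0, b=1.0, c=1.0):
    """Best response under the assumed linear demand p(Q) = a - b*Q and
    constant marginal cost c: maximize q*(a - b*(q + q_other)) - c*q over q >= 0."""
    return max(0.0, (a - c - b * q_other) / (2.0 * b))


def cournot_equilibrium(iterations=200):
    """Iterate simultaneous best responses. With these linear reaction
    functions (slope -1/2) the iteration converges to their intersection."""
    q1, q2 = 0.0, 0.0
    for _ in range(iterations):
        q1, q2 = reaction(q2), reaction(q1)
    return q1, q2


q1, q2 = cournot_equilibrium()
```

Best-response iteration is used here only as a convenient way to locate the intersection in this example; it is not part of the definition of equilibrium and need not converge for other demand or cost specifications.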

The Cournot game is often contrasted to the situation in which one firm,
say firm one, is a "Stackelberg leader" and the other firm is the "Stackelberg
follower." The Stackelberg leader moves first, and chooses an output which is
observed by the follower before the follower makes its own choice. Thus the
Stackelberg game is one of perfect information. In the backwards induction
(i.e. "perfect" - see page 17) equilibrium to this game, firm two's output is
along its reaction curve. Knowing this, firm one chooses its own output to
maximize its payoff along the graph of $R^2$. The first-order condition for this
choice is that

$\partial \pi^1(q_1, R^2(q_1))/\partial q_1 + \left(\partial \pi^1(q_1, R^2(q_1))/\partial q_2\right) dR^2(q_1)/dq_1 = 0$.

The backwards-induction equilibrium to the Stackelberg game is called the
"Stackelberg equilibrium." This terminology can be confusing to the beginner.
The Stackelberg equilibrium is not an alternative equilibrium for the Cournot
game, but rather a shorthand way of describing an equilibrium of an alternative
extensive form. While the prevailing terminology is too well established to
be changed, the student will do well to keep this distinction in mind.
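
As a worked illustration of the leader's first-order condition, consider an assumed linear specification that is not part of the text: inverse demand $p(Q) = a - bQ$ and costs $c_i(q_i) = c\,q_i$ with $a > c > 0$ and $b > 0$. Under these assumptions the condition can be solved in closed form:

```latex
% Assumed specification (not from the text): p(Q) = a - bQ, c_i(q_i) = c q_i.
\begin{align*}
R^2(q_1) &= \frac{a - c - b q_1}{2b}, \qquad \frac{dR^2(q_1)}{dq_1} = -\frac{1}{2}, \\
\frac{\partial \pi^1}{\partial q_1} &= a - c - 2 b q_1 - b q_2, \qquad
\frac{\partial \pi^1}{\partial q_2} = -b q_1, \\
0 &= \bigl(a - c - 2 b q_1 - b R^2(q_1)\bigr) + (-b q_1)\Bigl(-\tfrac{1}{2}\Bigr)
   = \frac{a - c}{2} - b q_1, \\
q_1 &= \frac{a - c}{2b}, \qquad q_2 = R^2(q_1) = \frac{a - c}{4b}.
\end{align*}
```

Under the same assumed specification each firm's Cournot output is $(a - c)/(3b)$, so in this example the leader produces more, and the follower less, than in the simultaneous-move game.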

The  Cournot  and  Bertrand  models  are  all  static  games,  in  which  firms  make 
their  choices  once  and  for  all.  Section  2A  discusses  a  dynamic  version  of  these 
games.   Also,  even  as  static  games  the  Cournot  and  Stackelberg  models  must  be 
thought  of  as  reduced  forms,  unless  one  literally  believes  in  the  existence  of 
the  price-setting  auctioneer.   Kreps-Scheinkman  (1983)  have  shown  that  the 
auctioneer  in  the  Cournot  model  can  be  replaced  by  a  second  period  in  which 
firms  choose  prices,  taking  their  production  as  fixed  (at  least  if  the  rationing 
scheme  is  "efficient"  and  the  demand  function  is  concave)  .   Thus  in  both  models 
firms  choose  both  prices  and  outputs;  the  difference  is  in  the  timing  of  these 
two  decisions.   (See  Gertner  (1985a)  for  simultaneous  choices.) 

Existence  of  Nash  Equilibria 

We  will  now  take  up  the  question  of  the  existence  of  Nash  equilibria.  Not 
all  games  have  pure-strategy  Nash  equilibria.   A  simple  example  is  "matching 
pennies":   players  one  and  two  simultaneously  announce  either  "heads"  or 
"tails."  If  the  announcements  match,  then  player  one  gains  a  util,  and  player 
two  loses  one.   If  the  announcements  differ,  it  is  player  two  who  wins  the  util, 
and  player  one  who  loses.   If  the  predicted  outcome  is  that  the  announcements 
will  match,  then  player  two  has  an  incentive  to  deviate,  while  player  one  would 
prefer  to  deviate  from  any  prediction  in  which  announcements  do  not  match.  The 
only  "stable"  situation  is  one  in  which  each  player  randomizes  between  his  two 
strategies,  assigning  equal  probability  to  each.   In  this  case  each  player  is 
completely  indifferent  between  his  possible  choices.   A  mixed-strategy  Nash 
equilibrium  is  simply  a  selection  of  mixed  strategies  such  that  no  player 
prefers  to  deviate,  i.e.  the  strategies  must  satisfy  equation  (1).   Since 
expected utilities are "linear in the probabilities," if a player uses a
non-degenerate  mixed  strategy  (one  that  puts  positive  weight  on  more  than  one 
pure  strategy)  then  that  player  cannot  strictly  prefer  not  to  deviate--the 
inequality  in  (1)  must  be  weak.   (For  the  same  reason,  it  suffices  to  check  that 
no  player  has  a  profitable  pure-strategy  deviation.)  This  raises  the  question 
of  why  a  player  should  bother  to  play  a  mixed  strategy,  when  he  knows  that  any 
of  the  pure  strategies  in  its  support  would  do  equally  well.   In  matching 
pennies,  if  player  one  knows  that  player  two  will  randomize,  player  one  has  a 
zero  expected  value  from  all  possible  choices.   As  far  as  his  payoff  goes,  he 
could  just  as  well  play  "heads"  with  certainty,  but  if  this  is  anticipated  by 
player  two  the  equilibrium  disintegrates.   Some  authors  have  suggested  that  for 
this  reason  there  is  no  "reasonable"  prediction  for  matching  pennies,  or, 
equivalently,  that  all  possible  probability  mixtures  over  outcomes  are  equally 
reasonable.   (See  e.g.  Bernheim  (1984)  and  Pearce  (1984).)  Harsanyi  (1973) 
followed  by  Aumann  et  al.  (1981)  and  Milgrom-Weber  (1986)  have  offered  the 
defense  that  the  "mixing"  should  be  interpreted  as  the  result  of  small, 
unobservable variations in the players' payoffs. Thus in our example, sometimes
player one might prefer matching on T to matching on H, and conversely.
Then for each value of his payoff player one would play a pure strategy. This
"purification" of mixed-strategy equilibria is discussed in Section 3C. Despite
some  controversy,  mixed  strategies  have  been  widely  used  both  in  "pure"  game 
theory  and  in  its  applications  to  industrial  organization. 
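
The indifference property of the matching-pennies equilibrium can be verified directly. The following sketch uses exact rational arithmetic; the payoff table encodes the game as described above (player one wins a util on a match), while the function name and the off-equilibrium probability 3/5 are illustrative assumptions.

```python
from fractions import Fraction

# Player one's payoff: +1 if the announcements match, -1 otherwise.
# Player two's payoff is the negative of this (the game is zero-sum).
U1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}


def payoff_one(action, prob_two_heads):
    """Player one's expected payoff from a pure action when player two
    announces heads with probability prob_two_heads."""
    return (prob_two_heads * U1[(action, "H")]
            + (1 - prob_two_heads) * U1[(action, "T")])


half = Fraction(1, 2)
at_equilibrium = {a: payoff_one(a, half) for a in ("H", "T")}

# If player two tilts toward heads, player one strictly prefers "H".
tilted = Fraction(3, 5)
off_equilibrium = {a: payoff_one(a, tilted) for a in ("H", "T")}
```

Against the 50-50 mixture both pure actions earn zero, so any mixture is a best response; against any other mixture player one has a strict best reply, which is why only equal probabilities are stable.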

One  reason  is  that,  as  shown  by  Nash  (1950),  mixed-strategy  equilibria 
always exist in finite games (games with a finite number of nodes, or,
equivalently, a finite number of normal-form pure strategies per player and a
finite number of players).

Theorem (Nash): Every finite n-player normal form game has a mixed-strategy
equilibrium.

This can be shown by applying the Kakutani fixed-point theorem to the players'
reaction  correspondences,  as  we  now  explain.  A  good  reference  for  some  of  the 
technical  details  involved  is  Green-Heller  (1981). 

Define player i's reaction correspondence, $r_i(\sigma)$, to be the
correspondence which gives the set of (mixed) strategies which maximize player
i's payoff when his opponents play $\sigma_{-i}$. This is just the natural
generalization of the Cournot reaction functions we introduced above. Since
payoffs are linear functions of the mixing probabilities, they are in particular
both continuous and quasiconcave. This implies that each player's reaction
correspondence is non-empty valued and convex-valued. Moreover, we can show
that the reaction correspondences are "upper hemi-continuous": if $\sigma^n \to \sigma$ and
$\sigma_i^n \in r_i(\sigma^n)$, then there is a subsequence of the $\sigma_i^n$ which converges to a
$\sigma_i \in r_i(\sigma)$. Now define the correspondence $r$ to be the Cartesian product of
the $r_i$. This correspondence satisfies the requirements of the Kakutani
fixed-point theorem: it maps a compact convex subset of Euclidean space (the
relevant probability simplex) into its subsets, and it is non-empty valued,
convex-valued, and upper hemi-continuous. Hence $r$ has a fixed point, and by
construction the fixed points of $r$ are Nash equilibria.

Economists  often  use  models  of  games  with  an  uncountable  number  of  actions. 
Some  might  argue  that  prices  or  quantities  are  "really"  infinitely  divisible, 
while  others  that  "reality"  is  discrete,  and  the  continuum  is  a  mathematical 
abstraction,  but  it  is  often  easier  to  work  with  a  continuum  of  actions  rather 
than  a  large  finite  grid.   Moreover,  as  Dasgupta-Maskin  (1986)  argue,  when  the 
continuum  game  does  not  have  an  equilibrium,  the  equilibria  corresponding  to 
fine, discrete grids could be very sensitive to exactly which finite grid is
specified.  These  fluctuations  can  be  ruled  out  if  the  continuum  game  has  an 
equilibrium.   The  existence  of  equilibria  for  infinite  games  is  more  involved 
than  for  finite  ones.   If  payoffs  are  discontinuous  there  may  be  no  equilibria 
at  all.   If  the  payoffs  are  continuous,  then  the  Fan  (1952)  fixed-point  theorem 
can  be  used  to  show  that  a  mixed-strategy  equilibrium  exists.   If  payoffs  are 
quasiconcave  as  well  as  continuous,  then  there  exist  equilibria  in  pure 
strategies,  as  shown  by  Debreu  (1952)  and  Glicksberg  (1952). 

Theorem (Debreu, Glicksberg, Fan): Consider an n-player normal form game whose
strategy spaces $S_i$ are compact convex subsets of a Euclidean space. If the
payoff functions $\pi^i(s)$ are continuous in $s$, and quasiconcave in $s_i$, there
exists a pure-strategy Nash equilibrium.

The proof here is very similar to that of Nash's theorem: we verify that
continuous payoffs imply non-empty, upper hemi-continuous reactions, and that
quasiconcavity in own actions implies that reactions are convex-valued.

Theorem (Glicksberg): Consider an n-player normal form game $(I, S, \pi)$. If for
each $i$, $S_i$ is a compact convex subset of a metric space, and $\pi^i$ is
continuous, then there exists a Nash equilibrium in mixed strategies.

Here the mixed strategies are the (Borel) probability measures over the pure
strategies, which we endow with the topology of weak convergence.[4] Once more,
the proof applies a fixed-point theorem to the reaction correspondences. One
point to emphasize is that the mixed-strategy payoffs will be quasiconcave in
own actions even if the pure-strategy payoffs are not. With infinitely many
pure strategies, the space of mixed strategies is infinite-dimensional, so a
more powerful fixed-point theorem is required. Alternatively, one can
approximate the strategy spaces by a sequence of finite grids. From Nash's
theorem, each grid has a mixed-strategy equilibrium. One then argues that since
the space of probability measures is weakly compact, we can find a limit point
of the sequence of these discrete equilibria. Since the payoffs are continuous,
it is easy to verify that the limit point is an equilibrium.

[4] Fix a compact metric space $A$. A sequence of measures $\mu_n$ on $A$ converges
"weakly" to a limit $\mu$ if $\int f \, d\mu_n \to \int f \, d\mu$ for every real-valued continuous function
$f$ on $A$.

There  are  many  examples  to  show  that  if  payoffs  are  discontinuous  equilibria 
need  not  exist.  Dasgupta-Maskin  argue  that  this  lack  of  existence  is  sometimes 
due  to  payoffs  failing  to  be  quasiconcave,  rather  than  failing  to  be  continuous. 
They show that if payoffs are quasiconcave, then a pure strategy equilibrium will
exist  under  a  very  weak  condition  they  call  "graph  continuity."  They  also 
provide  conditions  for  the  existence  of  mixed-strategy  equilibria  in  games 
without  quasiconcave  payoffs.  The  idea  of  their  result  is  to  provide  conditions 
ensuring  that  the  limits  of  the  discrete-grid  equilibria  do  not  have  "atoms" 
(non-negligible  probability)  on  any  of  the  discontinuity  points  of  the  payoff 
functions.  Simon  (1985)  relaxes  their  condition  by  requiring  only  that  at  least 
one  limit  has  this  no-atoms  property,  instead  of  all  of  them. 

A  sizable  literature  has  considered  the  existence  of  pure  strategy 
equilibrium  when  payoffs  are  not  quasiconcave,  particularly  in  the  Cournot 
model.   Without  quasiconcave  payoffs,  the  reaction  functions  can  have  "jumps." 
To prove existence of equilibrium in this setting one must show that the jumps
"do not matter." Roberts-Sonnenschein (1977) showed that "nice" preferences
and technologies need not lead to quasiconcave Cournot payoffs, and provided
examples of the non-existence of pure-strategy Cournot equilibrium. McManus
(1962) and Roberts-Sonnenschein (1976) show that pure strategy equilibria exist
in  symmetric  games  with  real-valued  actions  if  costs  are  convex.   The  key  is 
that  the  convex-cost  assumption  can  be  shown  to  imply  that  all  the  jumps  in  the 
reaction  functions  are  jumps  up.   Novshek  (1985)  has  shown  that  pure-strategy 
equilibria  exist  in  markets  for  a  homogeneous  good  where  each  firm's  marginal 
revenue  is  decreasing  in  the  aggregate  output  of  its  opponents,  for  any 
specification  of  the  cost  functions.   Topkis  (1970)  and  Vives  (1985)  use  a 
fixed-point  theorem  for  non-decreasing  functions  due  to  Tarski  (1955)  to  prove 
the  existence  of  pure-strategy  equilibria  in  games  where  the  reactions  are 
increasing. Tarski also proved that a function from [0,1] to [0,1] which has
no downward jumps has a fixed point, even if the function is not everywhere
non-decreasing. Vives uses this result to give a simple proof of the
McManus/Roberts-Sonnenschein result. (In symmetric equilibria each firm's
reaction function depends only on the sum of its opponents' actions, and all
firms have the same reaction function. Thus if the actions are real-valued,
the second of the Tarski results can be applied.)
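The first of the Tarski results (a fixed point for non-decreasing maps) has a simple computational counterpart. The sketch below is only an illustration, not the construction used by Topkis or Vives: it assumes a hypothetical non-decreasing map `reaction` on a finite grid of actions, with an upward jump. Iterating upward from the lowest action yields an increasing sequence on a finite chain, so it must stop at a fixed point (the smallest one); iterating downward from the highest action finds the largest one.

```python
def iterate_to_fixed_point(f, start, max_steps):
    """Apply f repeatedly from `start` until the value stops changing."""
    x = start
    for _ in range(max_steps):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    raise ValueError("no fixed point reached; is f non-decreasing on the grid?")


def reaction(x):
    """Hypothetical non-decreasing map on {0, ..., 10} with an upward jump at x = 6."""
    return x // 2 + 1 if x < 6 else x // 2 + 5


GRID = list(range(11))
# Climbing from the bottom finds the least fixed point.
least = iterate_to_fixed_point(reaction, GRID[0], len(GRID))
# Descending from the top finds the greatest fixed point.
greatest = iterate_to_fixed_point(reaction, GRID[-1], len(GRID))
print(least, greatest)
```

The jump at 6 does not prevent existence because it is a jump up; the map has several fixed points, and the two iterations bracket them.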

The  converse  of  the  existence  question  is  that  of  the  characterization  of  the 
equilibrium  set.   Ideally  one  would  prefer  there  to  be  a  unique  equilibrium, 
but  this  is  only  true  under  very  strong  conditions.  When  several  equilibria 
exist,  one  must  see  which,  if  any,  seem  to  be  reasonable  predictions,  but  this 
requires  examination  of  the  entire  Nash  set.   The  reasonableness  of  one 
equilibrium  may  depend  on  whether  there  are  others  with  competing  claims. 
Unfortunately,  in  many  interesting  games  the  set  of  equilibria  is  difficult  to 
characterize.

Correlated Equilibria

The  Nash  equilibrium  concept  is  intended  to  be  a  minimal  necessary  condition 
for  "reasonable"  predictions  in  situations  where  the  players  must  choose  their 
actions  "independently."   Let  us  return  to  our  story  of  players  who  may  have 
pre-play  discussion,  but  then  must  go  off  to  isolated  rooms  to  choose  their 
strategies.   In  some  situations,  both  players  could  gain  if  they  could  build  a 
"signalling  device"  that  sent  signals  to  the  separate  rooms.   Aumann's  (1974) 
notion  of  a  correlated  equilibrium  captures  what  could  be  achieved  with  any  such 
signals.   (See  Myerson  (1983)  for  a  fuller  introduction  to  this  concept,  and 
for  a  discussion  of  its  relationship  to  the  theory  of  mechanism  design.) 

To  motivate  this  concept,  consider  Aumann's  example,  presented  in  Figure 
7.   This  game  has  three  equilibria:  (U,L),  (D,R),  and  a  mixed-strategy 
equilibrium  that  gives  each  player  2.5.   If  they  can  jointly  observe  a  "coin 
flip"  (or  sunspots,  or  any  other  publicly  observable  random  variable)  before 
play,  they  can  achieve  payoffs  (3,3)  by  a  joint  randomization  between  the  two 
pure-strategy equilibria. However, they can do even better (still without
binding contracts) if they can build a device that sends different, but
correlated,  signals  to  each  of  them.  This  device  will  have  three  equally  likely 
states,  A,  B,  and  C.   Player  one's  information  partition  is  (A,(B,C)).   This 
means  that  if  A  occurs,  player  one  is  perfectly  informed,  but  if  the  state 
is  B  or  C  ,  player  one  does  not  know  which  of  the  two  prevails.  Player  two's 
information  partition  is  ((A,B),C).   In  this  transformed  game,  the  following 
is  a  Nash  equilibrium:   player  one  plays  U  when  told  A  ,  and  D  when  told 
(B,C);  player  two  plays  R  when  told  C  ,  and  L  when  told  (A,B).  Let's  check 
that  player  one  does  not  want  to  deviate.   When  he  observes  A  ,  he  knows  that 
two observes (A,B), and thus that two will play L; in this case U is player
one's best response. If player one observes (B,C), then conditional on his
information  he  expects  player  two  to  play   L  and  R  with  equal  probability. 
In  this  case  player  one  will  average  2.5  from  either  of  his  choices,  so  he  is 
willing  to  choose  D  .   So  player  one  is  choosing  a  best  response;  the  same  is 
easily  seen  to  be  true  for  player  two.  Thus  we  have  constructed  an  equilibrium 
in  which  the  players'  choices  are  correlated:   the  outcomes  (U,L),  (D,L),  and 
(D,R)  are  chosen  with  probability  one-third  each,  while  the  "bad"  outcome  (U,R) 
never  occurs.   In  this  new  equilibrium  the  expected  payoffs  are  3  1/3  each, 
which  is  better  than  in  any  of  the  equilibria  of  the  game  without  the  signalling 
device.   (Note  that  adding  the  signalling  device  does  not  remove  the  "old" 
equilibria:   since  the  signals  do  not  influence  payoffs,  if  player  one  ignores 
his  signal,  player  two  may  as  well  ignore  hers.) 
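The arithmetic in this example can be checked mechanically. Figure 7 is not reproduced in this section, so the payoff matrix in the sketch below is an assumption: it is the usual statement of Aumann's example, and it is consistent with every number quoted in the text (2.5 in the mixed equilibrium, 3 from the coin flip, 3 1/3 with the device). The names `PAYOFFS`, `PLAY_ONE`, and so on are illustrative only.

```python
from fractions import Fraction

# ASSUMED payoffs (player one, player two); Figure 7 itself is not shown here.
PAYOFFS = {
    ("U", "L"): (5, 1), ("U", "R"): (0, 0),
    ("D", "L"): (4, 4), ("D", "R"): (1, 5),
}
STATES = ["A", "B", "C"]
PRIOR = {s: Fraction(1, 3) for s in STATES}      # three equally likely states
PARTITION_ONE = [{"A"}, {"B", "C"}]              # what player one can distinguish
PARTITION_TWO = [{"A", "B"}, {"C"}]              # what player two can distinguish
PLAY_ONE = {"A": "U", "B": "D", "C": "D"}        # proposed strategy of player one
PLAY_TWO = {"A": "L", "B": "L", "C": "R"}        # proposed strategy of player two


def expected_payoffs():
    """Ex ante expected payoffs when both players follow the proposed strategies."""
    total = [Fraction(0), Fraction(0)]
    for s in STATES:
        u = PAYOFFS[(PLAY_ONE[s], PLAY_TWO[s])]
        total[0] += PRIOR[s] * u[0]
        total[1] += PRIOR[s] * u[1]
    return tuple(total)


def conditional_payoff(player, cell, action):
    """Payoff to `player` from `action`, given the state lies in `cell`
    and the opponent follows the proposed strategy."""
    mass = sum(PRIOR[s] for s in cell)
    value = Fraction(0)
    for s in cell:
        profile = (action, PLAY_TWO[s]) if player == 0 else (PLAY_ONE[s], action)
        value += PRIOR[s] * PAYOFFS[profile][player]
    return value / mass


def is_equilibrium():
    """Check that each prescribed action is a best response at every information cell."""
    checks = (
        (0, PARTITION_ONE, PLAY_ONE, ("U", "D")),
        (1, PARTITION_TWO, PLAY_TWO, ("L", "R")),
    )
    for player, partition, plan, actions in checks:
        for cell in partition:
            prescribed = plan[next(iter(cell))]
            best = max(conditional_payoff(player, cell, a) for a in actions)
            if conditional_payoff(player, cell, prescribed) < best:
                return False
    return True


print(expected_payoffs(), is_equilibrium())
```

Under the assumed payoffs, player one is exactly indifferent when told (B,C), which is why he is only "willing" to choose D rather than strictly preferring it.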

If  we  had  to  analyze  each  possible  signalling  device  one  at  a  time,  we  would 
never  be  done.   Fortunately,  if  we  want  to  know  what  could  be  done  with  all 
possible  devices,  we  can  dispense  with  the  signals,  and  work  directly  with 
probability  distributions  over  strategies.   In  our  example,  players  need  not 
be  told  about  the  states   A,  B,  and  C  .  They  could  simply  be  given  recommended 
strategies,  as  long  as  the  joint  distribution  over  recommendations  corresponds 
to  the  joint  distribution  over  outcomes  that  we  derived.   Player  one  could  be 
told  "play  D"  instead  of  (B,C),  as  long  as  this  means  there's  a  50-50  chance 
of  player  two  playing  L  . 

Definition: A correlated equilibrium is any probability distribution $p(s)$
over the pure strategies $S_1 \times \dots \times S_n$ such that, for every
player $i$ and every function $d_i(s_i)$ that maps $S_i$ to $S_i$,

$\pi^i(p) \ge \sum_s p(s)\, \pi^i(d_i(s_i), s_{-i})$.

That is, player $i$ should not be able to gain by disobeying the recommendation
to play $s_i$ if every other player obeys the recommendations.


A  pure-strategy  Nash  equilibrium  is  a  correlated  equilibrium  in  which  the 
distribution  p(s)  is  degenerate.   Mixed-strategy  Nash  equilibria  are  also 
correlated  equilibria:  just  take  p(s)  to  be  the  joint  distribution  over  actions 
implied  by  the  equilibrium  strategies,  so  that  the  recommendations  made  to  each 
player  convey  no  information  about  the  play  of  his  opponents. 

Inspection  of  the  definition  shows  that  the  set  of  correlated  equilibria 
is  convex,  so  the  set  of  correlated  equilibria  is  at  least  as  large  as  the  convex 
hull  of  the  Nash  equilibria.   Since  Nash  equilibria  exist  in  finite  games, 
correlated  equilibria  do  too.  Actually,  the  existence  of  correlated  equilibria 
would  seem  to  be  a  simpler  problem  than  the  existence  of  Nash  equilibria, 
because  the  set  of  correlated  equilibria  is  defined  by  a  system  of  linear 
inequalities,  and  is  therefore  convex.   Recently,  Hart  and  Schmeidler  (1986) 
have provided an existence proof that uses only linear methods (as opposed to
fixed-point theorems). One might also like to know when the set of correlated
equilibria  differs  "greatly"  from  the  convex  hull  of  the  Nash  equilibria,  but 
this  question  has  not  yet  been  answered. 
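The linear structure can be written out explicitly. Since a deviation rule can be chosen separately for each recommendation, requiring that no rule $d_i$ be profitable is equivalent to requiring that, for each recommendation $s_i$ and each alternative $s_i'$ (notation introduced here for illustration), switching from $s_i$ to $s_i'$ not be profitable:

```latex
\sum_{s_{-i}} p(s_i, s_{-i})
  \bigl[\pi^i(s_i, s_{-i}) - \pi^i(s_i', s_{-i})\bigr] \;\ge\; 0
\qquad \text{for all } i \text{ and all } s_i, s_i' \in S_i ,
\qquad
p(s) \ge 0, \quad \sum_{s} p(s) = 1 .
```

In a finite game these are finitely many linear constraints in the unknowns $p(s)$, so the set of correlated equilibria is a convex polytope.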

We  take  the  view  that  the  correlation  in  correlated  equilibria  should  be 
thought  of  as  the  result  of  the  players  receiving  correlated  signals,  so  that 
the  notion  of  correlated  equilibrium  is  particularly  appropriate  in  situations 
with  pre-play  communication,  for  then  the  players  might  be  able  to  design  and 
implement  a  procedure  for  obtaining  correlated,  private  signals.  However,  we 
should point out that Aumann (1986) and Brandenburger-Dekel (1985b) argue that
the correlated equilibrium notion is more "natural" than the Nash one from the
point  of  view  of  subjective  probability  theory. 

Coalition-Proof  Equilibria  and  Strong  Equilibria 

While  no  single  player  can  profitably  deviate  from  a  Nash  equilibrium,  it 
may  be  that  some  coalition  could  arrange  a  mutually  beneficial  deviation.   If 
players  can  engage  in  pre-play  communication,  then  some  coalitions  of  players 
might  hope  to  arrange  for  joint  deviations  from  the  specified  play.  The  notion 
of  a  "strong  equilibrium"  (Aumann  (1959))  requires  that  no  subset  of  players, 
taking  the  actions  of  the  others  as  given,  could  jointly  deviate  in  a  way  that 
benefits all of its members. As this requirement applies to the grand coalition
of all players, strong equilibria are Pareto-efficient. Because no
restrictions are placed on the play of a deviating coalition, the conditions
for a strong equilibrium are quite stringent, and these equilibria fail to exist
in many games of interest for industrial organization, such as
Cournot oligopoly. Recently, Bernheim, Peleg, and Whinston (1986) (B-P-W) have
proposed the idea of a "coalition-proof" equilibrium, which, they argue, is a
more  natural  way  to  take  account  of  coalitional  deviations. 

The  best  way  to  explain  their  concept  is  to  use  their  example,  which  also 
serves  the  important  function  of  showing  why  the  criterion  of  Pareto-dominance 
may  not  be  a  good  way  to  select  between  equilibria  when  there  are  more  than  two 
players.  In  Figure  10,  player  one  chooses  rows,  player  two  chooses  columns, 
and  player  three  chooses  matrices.  This  game  has  two  pure-strategy  Nash 
equilibria, (U,L,A) and (D,R,B), and an equilibrium in mixed strategies. B-P-W
do not consider mixed strategies, so we will temporarily restrict attention to
pure ones. The equilibrium (U,L,A) Pareto-dominates (D,R,B). Is (U,L,A) then
the obvious focal point? Imagine that this was the expected solution, and hold
player three's choice fixed. This induces a two-player game between players
one and two. In this two-player game, (D,R) is the Pareto-dominant equilibrium!
Thus, if players one and two expect that player three will play A, and if they
can coordinate their play on their Pareto-preferred equilibrium in matrix A,
they  should  do  so,  which  would  upset  the  "good"  equilibrium  (U,L,A). 

The  definition  of  a  coalition-proof  equilibrium  proceeds  by  induction  on 
the  coalition  size.   First  one  requires  that  no  one-player  coalition  can 
deviate,  i.e.  that  the  given  strategies  are  a  Nash  equilibrium.   Then  one 
requires that no two-player coalition can deviate, given that once such a
deviation  has  "occurred",  either  of  the  deviating  players  (but  none  of  the 
others)  is  free  to  deviate  again.   That  is,  the  two-player  deviations  must  be 
Nash  equilibria  of  the  two-player  game  induced  by  holding  the  strategies  of  the 
others  fixed.  And  one  proceeds  in  this  way  up  to  the  coalition  of  all  players. 
Clearly  (U,L,A)  in  Figure  10  is  not  coalition-proof;  brief  inspection  shows  that 
(D,R,B)  is.   However,  (D,R,B)  is  not  Pareto-optimal,  and  thus  is  not  a  strong 
equilibrium;  no  strong  equilibrium  exists  in  this  game. 

The  idea  of  coalition-proofness  is  an  interesting  way  to  try  to  model  the 
possibility  of  coalitional  deviations.   However,  the  assumption  that  only 
subsets  of  the  deviating  coalitions  can  be  involved  in  further  deviations  can 
be  questioned,  and  the  general  properties  of  the  concept  are  unknown.  For  these 
reasons,  and  because  coalition-proof  equilibria  need  not  exist  (even  with  mixed 
strategies),  we  feel  that  at  this  time  the  B-P-W  paper  is  more  important  for 
the  issues  it  raises  than  for  its  solution  concept.  We  should  mention  here  that 
Bernheim-Whinston  (1986)  apply  coalition-proofness  to  several  well-known  games 
with  interesting  results. 

2. Dynamic Games of Complete Information

Most of the examples in the last section were static games: each player's
choice  of  actions  was  independent  of  the  choices  of  his  opponents.  Many  of  the 
interesting  strategic  aspects  of  the  behavior  of  firms  are  best  modelled  with 
dynamic  games,  in  which  players  can  observe  and  respond  to  their  opponents' 
actions.   This  is  true  not  only  of  inherently  dynamic  phenomena  such  as 
investment,  entry  deterrence,  and  exit,  but  also  of  the  determination  of  price 
and  output  in  a  mature  market.   Section  2  discusses  a  few  special  kinds  of 
dynamic  games  that  have  been  frequently  used  in  the  study  of  oligopoly  theory. 
These  are  all  games  of  complete  information,  i.e.  the  payoff  functions  are 
common  knowledge.   Section  3  discusses  games  of  incomplete  information,  which 
have  become  increasingly  common  in  the  literature. 

Subgame  Perfection 

In  dynamic  games  a  question  arises  that  is  not  present  in  static  ones: 
What  beliefs  should  players  have  about  the  way  that  their  current  play  will 
affect  their  opponents'  future  decisions?  Recall  that  the  game  in  Figure  1  had 
two  Nash  equilibria,  (D,R)  and  (U,L).  We  argued  that  (U,L)  was  unreasonable, 
because   L  was  dominated  by  R  for  player  two.   Alternatively,  we  arrived 
at  (D,R)  as  our  prediction  by  working  backwards  through  the  tree.   Another  way 
of  putting  this  is  that  player  one  should  not  be  deterred  from  playing  D   by 
the  "threat"  of  player  two  playing  L  ,  because  if  player  two's  information  set 
was actually reached, two would back off from his "bluff" and play R. This
approach  is  useful  for  thinking  about  situations  in  which  backwards  induction 
and/or  weak  dominance  arguments  do  not  give  sharp  conclusions.  Selten's  (1965) 
notion of a subgame-perfect equilibrium generalizes the backwards-induction
idea to rule out empty threats in more general situations.
Subgame-perfect equilibrium strategies must yield a Nash equilibrium, not
just  in  the  original  game,  but  in  every  one  of  its  "proper  subgames."  We'll 
define  this  more  formally  in  Section  3,  but  for  now  think  of  a  proper  subgame 
as  a  subset  of  the  initial  game  tree  which:    1)  is  closed  under  succession--if 
a  node  is  in  the  subgame,  so  are  all  of  its  successors;  2)  "respects  information 
sets"  which  means  roughly  that  all  of  the  information  sets  of  the  subgame  are 
information  sets  of  the  initial  game;  and  3)  begins  with  an  information  set  that 
contains  only  one  node.   This  last  requirement  is  in  a  general  sense  very 
restrictive,  which  is  one  of  the  motivations  for  the  various  refinements  of  the 
perfection  concept.   However,  most  of  the  games  we  discuss  in  this  section  are 
"deterministic  multi-period  games,"  which  have  a  very  simple  structure  that 
makes  subgame-perfection  a  useful  tool.   These  games  have  extensive  forms  that 
can be divided into periods so that: (1) at the start of the kth period all
play in periods 1 through (k-1) is common knowledge (the initial information
sets in each period are all singletons); and (2) no information set contained
in  the  kth  period  provides  any  knowledge  of  play  within  that  period.   Any  game 
of  perfect  information  is  a  multi-period  game:   just  take  all  the  successors 
of  the  initial  nodes  to  belong  to  period  1,  their  successors  to  period  2,  and 
so  on.  The  Cournot  and  Bertrand  models  are  1-period  games.   If  the  same  players 
play  a  Cournot  game  twice  in  a  row,  and  all  players  observe  the  "first-period" 
quantities  before  making  their  second  choice,  we  have  a  two-period  game. 

In  a  multi-period  game,  the  beginning  of  each  period  marks  the  beginning 
of  a  new  subgame.  Thus  for  these  games  we  can  rephrase  subgame-perfection  as 
simply  the  requirement  that  the  strategies  yield  a  Nash  equilibrium  from  the 
start  of  each  period. 

Figure  5  is  actually  the  game  Selten  used  to  introduce  subgame  perfection. 
Here there are two proper subgames: the whole game, and the game beginning
in the second "period" if one played D. In this subgame, the only Nash
equilibrium is for player two to choose L, so that any subgame perfect
equilibrium must prescribe this choice, and only (D,L) is subgame-perfect. More
generally,  in  any  game  of  perfect  information  subgame  perfection  yields  the  same 
answer  as  backwards  induction.   In  finite-period  simultaneous  move  games, 
subgame-perfection  does  "backwards  induction"  period  by  period:   at  the  last 
period,  the  strategies  must  yield  a  Nash  equilibrium,  given  the  history.   Then 
we  replace  the  last  period  with  the  possible  last-period  equilibria,  and  work 
backwards.   For  example,  a  subgame-perfect  equilibrium  of  a  two-period  Cournot 
model  must  yield  Cournot  equilibrium  outputs  in  the  second  period,  regardless 
of  first-period  play.   Caution:   if  there  are  several  Cournot  equilibria,  then 
which  of  them  prevails  in  the  second  period  can  depend  on  first-period  play. 
We  will  say  more  about  this  when  we  discuss  Benoit-Krishna  (1985). 
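For games of perfect information, the backwards-induction procedure is easy to state as a recursion. The sketch below is illustrative only: the tree and its payoffs are hypothetical, chosen to match the description of the Figure 1 game given earlier (two Nash equilibria, (D,R) and (U,L), with L weakly dominated for player two), since the figure itself is not reproduced here. Ties are broken in favor of the first action listed.

```python
def backward_induction(node):
    """Return (payoffs, plan) for the subgame that starts at `node`.

    A terminal node is a dict with key "payoffs"; a decision node has
    "name", "player" (0 or 1), and "moves" (action -> child node).
    """
    if "payoffs" in node:
        return node["payoffs"], {}
    plan = {}
    best_action, best_payoffs = None, None
    mover = node["player"]
    for action, child in node["moves"].items():
        payoffs, sub_plan = backward_induction(child)
        plan.update(sub_plan)  # keep the choices made in every subgame
        if best_payoffs is None or payoffs[mover] > best_payoffs[mover]:
            best_action, best_payoffs = action, payoffs
    plan[node["name"]] = best_action
    return best_payoffs, plan


# Hypothetical payoffs (player one, player two).
GAME = {
    "name": "one", "player": 0,
    "moves": {
        "U": {"payoffs": (2, 2)},
        "D": {
            "name": "two", "player": 1,
            "moves": {"L": {"payoffs": (0, 0)}, "R": {"payoffs": (3, 1)}},
        },
    },
}

payoffs, plan = backward_induction(GAME)
print(payoffs, plan)
```

The plan records a choice at every decision node, including nodes off the predicted path, which is what distinguishes a strategy from an outcome.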

2A. Repeated Games and "Implicit Collusion"

Infinitely Repeated Games

Chamberlin  (1956)  criticized  the  Cournot  and  Bertrand  models  of  oligopoly 
for  assuming  that  firms  were  myopic.   He  argued  that  in  an  industry  with  few, 
long-lived  firms,  firms  would  realize  their  mutual  interdependence  and  thus  play 
more  "cooperatively"  than  the  Cournot  and  Bertrand  models  suggested.  The  theory 
of  repeated  games  provides  the  simplest  way  of  thinking  about  the  effects  of 
long-term  competition. 

This theory shows that, under the proper circumstances, Chamberlin's
intuition  can  be  partially  formalized.   Repetition  can  allow  "cooperation"  to 
be  an  equilibrium,   but  it  does  not  eliminate  the  "uncooperative"  static 
equilibria, and indeed can create new equilibria which are worse for all players
than if the game had been played only once. Thus to complete the Chamberlin
argument, one must argue that the "cooperative" equilibria are "reasonable."

In an infinitely repeated game, players face the same constituent game in
each of infinitely many periods. There is no direct physical link between the
periods; each period's feasible actions and per-period payoffs are exactly as
in the constituent game. This rules out important phenomena such as investment
in productive machinery, so few interesting industries can be modelled exactly
as repeated games. Nevertheless, if the history-dependent aspects of the
industry are not too important, the repeated game model may be a reasonable
approximation. Also, many of the qualitative predictions about the importance
of repeated play and the nature of equilibria are useful in thinking about more
general dynamic games, as we discuss in Section 2B. Of course, the main reason
that repeated games have received so much attention is their simplicity.

The constituent game $g$ is a finite n-player game in normal form,
$(I, \Sigma, \pi)$, where $\Sigma_i$ is the set of probability distributions over
a finite set $S_i$ of pure strategies. In the repeated version of $g$, each
player i's strategy is a sequence of maps $(\sigma_i(t))$ mapping the previous
actions of all players to a $\sigma_i \in \Sigma_i$. Let us stress that it is the
past actions that are observable, and not past choices of mixed strategies.

Players maximize the average discounted sum of their per-period payoffs with
common discount factor $\delta$. (We use the average discounted sum rather than
simply the sum so that payoffs in the one-shot and repeated games are
comparable--if a player receives payoff 5 every period his average discounted
payoff is 5, while the discounted sum is, of course, $5/(1-\delta)$.)
Player i's reservation utility is

$v_i^* = \min_{\sigma_{-i}} \max_{\sigma_i} \pi^i(\sigma_i, \sigma_{-i})$.

In any equilibrium of the repeated game, player i's strategy must be a best
response to the strategies of his opponents. One option player i has is to
play myopically in each period, that is to play to maximize that period's
payoff, ignoring the way this influences his opponents' future play. This
static maximization will give player i at least $v_i^*$ in each period, so that
in any equilibrium, player i's expected average payoff must be at least
$v_i^*$. A payoff vector $v$ is individually rational if for all players
$v_i > v_i^*$.

Notice  that  the  equilibria  of  the  constituent  game  (the  "static  equilibria") 
remain  equilibria  if  the  game  is  repeated:  If  each  player's  play  is  independent 
of  the  past  history,  then  no  player  can  do  better  than  to  play  a  static  best 
response.  Notice  also  that  if  the  discount  factor  is  very  low,  we'd  expect  that 
the  static  equilibria  are  the  only  equilibria--if  the  future  is  unimportant, 
then  once  again  players  will  choose  static  best  responses.  (This  relies  on  g 
being  finite.) 

The  best-known  result  about  repeated  games  is  the  celebrated  "folk  theorem." 
This  theorem  asserts  that  if  the  game  is  repeated  infinitely  often  and  players 
are  sufficiently  patient,  then  "virtually  anything"  is  an  equilibrium  outcome. 
By  treating  the  polar  case  of  extreme  patience,  the  folk  theorem  provides  an 
upper  bound  for  the  effects  of  repeated  play,  and  thus  a  benchmark  for  thinking 
about  the  intermediate  case  of  mild  impatience. 

The  oldest  version  of  the  folk  theorem  asserts  that  if  players  are 
sufficiently  patient  (the  discount  factors  are  near  enough  to  one)  then  any 
feasible individually rational payoffs are supportable by a Nash equilibrium.
The idea of the proof is simple: any deviation from the prescribed path by
player i leads the other players to play to "minmax" him (i.e., using the
strategies that attain the minimum in the definition of $v_i^*$) for the rest of
the game. In a repeated Cournot game, this would correspond to all players
choosing the largest possible output forever. Given this threat, players will
indeed choose not to deviate as long as

(1) never deviating yields more than $v_i^*$, and

(2) the discount factor is large enough that the gains to any
one-period deviation are outweighed by the never-ending
("grim") punishment.
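Condition (2) can be made concrete in a simple special case. As a sketch under assumptions not in the original: suppose the prescribed path gives player $i$ the same per-period payoff $v_i$ in every period, let $\bar{d}_i > v_i$ denote the most he can earn in the single period in which he deviates, and suppose he is held to $v_i^*$ in every later period. Comparing average discounted payoffs with discount factor $\delta$:

```latex
\underbrace{(1-\delta)\,\bar{d}_i + \delta\, v_i^{*}}_{\text{deviate once, then be minmaxed}}
\;\le\;
\underbrace{v_i}_{\text{never deviate}}
\quad\Longleftrightarrow\quad
\delta \;\ge\; \frac{\bar{d}_i - v_i}{\bar{d}_i - v_i^{*}} .
```

The threshold on the right is strictly below one exactly when $v_i > v_i^*$, which is condition (1); so for payoffs of this kind, a discount factor close enough to one always suffices.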

The  strategies  sketched  above  clearly  need  not  be  subgame  perfect--no  firm 
would  choose  to  produce  a  huge  amount  if  the  market  price  were  zero!   However, 
the  "perfect  folk  theorem"  shows  that  the  same  outcome  can  be  enforced  by  a 
perfect  equilibrium,  so  that  restricting  attention  to  perfect  equilibria  does 
not reduce the limit set of equilibrium payoffs. (It does, of course, rule out
some  Nash  equilibria.) 

Friedman  (1971)  proved  a  weaker  version  of  this  theorem  which  showed  that 
any  payoffs  better  for  all  players  than  a  Nash  equilibrium  of  the  constituent 
game  are  the  outcome  of  a  perfect  equilibrium  of  the  repeated  game,  if  players 
are  sufficiently  patient.   The  desired  play  is  enforced  by  the  "threat"  that 
any  deviation  will  trigger  a  permanent  switch  to  the  static  equilibrium. 
Because  this  "punishment"  is  itself  a  perfect  equilibrium,  so  are  the  overall 
strategies.   This  result  shows,  for  example,  that  patient,  identical,  Cournot 
duopolists  can  "implicitly  collude"  by  each  producing  one-half  the  monopoly 
output,  with  any  deviation  triggering  a  switch  to  the  Cournot  outcome.   This 
would be "collusive" in yielding the monopoly price. The collusion is
"implicit" (or "tacit") in that the firms would not need to enter into binding
contracts to enforce their cooperation. Instead, each firm is deterred from
breaking  the  agreement  by  the  (credible)  fear  of  provoking  Cournot  competition. 
If  this  equilibrium  is  suitably  "focal,"  as  it  might  be  with  two  identical 
firms,  then  the  firms  might  be  able  to  collude  without  even  communicating!   This 
possibility  has  grave  implications  for  anti-collusion  laws  based  on  observed 
conduct.   How  could  two  non-communicating  firms  be  charged  with  conspiracy? 
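How patient is "sufficiently patient" in the duopoly example? The sketch below works it out under assumptions that are not in the original text: linear inverse demand P = a - Q, a common constant marginal cost c, and a deviator who plays a static best response for one period before play reverts to the Cournot outcome forever. The function name and parameters are illustrative. With average discounted payoffs, collusion is sustainable when the collusive profit is at least (1 - delta) times the deviation profit plus delta times the Cournot profit.

```python
from fractions import Fraction


def critical_discount_factor(a, c):
    """Smallest discount factor at which Cournot reversion sustains
    each firm producing half the monopoly output (assumed linear model)."""
    margin = Fraction(a) - Fraction(c)  # a - c, assumed positive

    def profit(q_own, q_other):
        # Price minus marginal cost is (a - c) - total output.
        return (margin - q_own - q_other) * q_own

    q_collusive = margin / 4                # half of the monopoly output (a - c)/2
    q_cournot = margin / 3                  # static Cournot equilibrium output
    q_deviate = (margin - q_collusive) / 2  # static best response to q_collusive

    pi_collusive = profit(q_collusive, q_collusive)
    pi_cournot = profit(q_cournot, q_cournot)
    pi_deviate = profit(q_deviate, q_collusive)

    # pi_collusive >= (1 - d) * pi_deviate + d * pi_cournot, solved for d.
    return (pi_deviate - pi_collusive) / (pi_deviate - pi_cournot)


print(critical_discount_factor(10, 2))
```

In this linear specification the threshold does not depend on a or c, because all three profit levels scale with the square of a - c.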

Whether  collusion  can  be  enforced  in  a  particular  oligopoly  then  depends 
on  whether  the  "relevant"  discount  factor  is  sufficiently  large.  This  discount 
factor measures the length of the observation lag between periods, as well as
the players' impatience "per unit time." In a market where orders are large
but  infrequent,  a  single  order  might  represent  several  years  of  full-time 
production.  Here  the  short-run  gains  to  cheating  might  well  outweigh  the  costs 
of  (greatly  delayed)  punishments.   In  the  other  extreme,  with  frequent,  small 
orders,  implicit  collusion  is  more  likely  to  be  effective. 
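One way to make the role of the observation lag explicit, under an assumed parametrization that is not part of the original text: let $r > 0$ be the players' rate of time preference per unit time and $\Delta$ the length of a period, so that the per-period discount factor is $\delta = e^{-r\Delta}$. If collusion requires $\delta \ge \delta^*$ for some threshold $\delta^* \in (0,1)$, the requirement becomes an upper bound on the lag:

```latex
\delta = e^{-r\Delta} \;\ge\; \delta^{*}
\quad\Longleftrightarrow\quad
\Delta \;\le\; \frac{-\ln \delta^{*}}{r} .
```

Large, infrequent orders correspond to a large $\Delta$ and hence a small $\delta$, even when the players are patient per unit time.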

The  Friedman  result  is  weaker  than  the  folk  theorem  because  of  its 
requirement  that  both  players  do  better  than  in  a  static  equilibrium.   As  a 
Stackelberg  follower's  payoffs  are  worse  than  a  Cournot  duopolist's,  Friedman's 
result  does  not  show  that  the  Stackelberg  outcome  can  be  enforced  in  a  repeated 
Cournot game. That this is nevertheless true is shown in the "perfect folk theorems"
of  Aumann-Shapley  (1976),  Rubinstein  (1979),  and  Fudenberg-Maskin  (1986a). 
Aumann-Shapley  and  Rubinstein  consider  the  no-discounting  models  in  which 
players  are  "completely"  patient.   Fudenberg-Maskin  show  that,  under  a  mild 
"full-dimensionality"  condition,  the  result  continues  to  hold  if  the  discount 
factors  are  sufficiently  close  to  one.   They  also  strengthen  earlier  results 
by  allowing  players  to  use  mixed  strategies  as  punishments.  Aumann-Shapley  and 
Rubinstein  had  restricted  attention  to  pure  strategies,  which  leads  to  higher 
individually-rational payoff levels, and thus a weaker theorem. (Their work
can also be interpreted as allowing mixed strategies as long as the mixing
probabilities  themselves,  and  not  just  the  actions  actually  chosen,  are 
observable  at  the  end  of  each  period.) 

One might wish to characterize the set of perfect equilibria when there
is "substantial" impatience. A fascinating paper by Abreu (1984) provides a
tool for this purpose. (See also Harris (1986), who gives a clearer exposition
and simpler proofs of Abreu's results.) Call strategies "simple" if they have
the following form: there is an "equilibrium path" and n "punishment paths,"
one for each player. Play follows the equilibrium path as long as no one has
deviated. If player i was the most recent player to deviate, and did so at
period t, then play at period (t+k) is given by the kth element of the
"punishment path" corresponding to player i. (What happens if two or more
players deviate simultaneously is irrelevant: in testing for Nash or
subgame-perfect equilibria, we ask only if a player can gain by deviating when
his opponents play as originally specified.) The force in the restriction
to simple strategies is that player i's punishment path is independent of the
history before i's deviation and also of the nature of the deviation itself.
Simple strategies are optimal if each player's average discounted utility at
the beginning of his punishment phase is the lowest payoff he receives in any
perfect equilibrium.

As the set of equilibria is closed (Fudenberg-Levine [1983]), there is a
worst perfect equilibrium w(i) for each player i. Any equilibrium path that
can be enforced by the threat that player i's deviations will be punished by
switching to some equilibrium can clearly be enforced by the threat that player
i's deviations will be punished by switching to w(i). Therefore, as Abreu shows,
optimal simple strategies exist, and any perfect equilibrium outcome can be
enforced by such strategies. Thus, to characterize the set of equilibria in
any  game  it  suffices  to  find  the  worst  possible  perfect  equilibrium  payoffs  for 
each  player.   In  general  this  may  be  a  difficult  problem,  but  the  set  of 
symmetric  equilibria  of  symmetric  games  is  more  easily  characterized. 
[Caution  --  symmetry  here  requires  not  only  that  the  payoffs  along  the 
equilibrium  path  be  identical,  but  that  the  payoffs  be  identical  in  the 
punishment  phases  as  well.]   Abreu's  thesis  (1983)  uses  the  idea  of  optimal 
simple  strategies  to  characterize  the  symmetric  equilibria  of  repeated  Cournot 
games.   Shapiro's  essay  in  this  Handbook  explains  this  characterization  in 
detail . 

Another  case  in  which  the  lowest  perfect  equilibrium  payoffs  can  be  pinned 
down  is  when  equilibria  can  be  constructed  that  hold  players  to  their 
reservation  values.  Fudenberg-Maskin  (1987)  provide  conditions  for  this  to  be 
true for a range of discount factors between some $\delta$ and 1. Because the
reservation  values  are  of  course  the  worst  possible  punishments,  any  equilibrium 
outcome  (Nash  or  perfect)  can  be  enforced  with  the  threat  that  deviations  will 
switch  play  to  an  equilibrium  in  which  the  deviator  is  held  to  his  reservation 
value. 
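
The enforcement logic of the last few paragraphs can be sketched numerically for the simplest case of a stationary path. This is only an illustration under stated assumptions: payoffs are normalized as average discounted values, the deviator is held to a fixed punishment value from the next period on, and the function names and numbers are hypothetical.

```python
def is_enforceable(path_payoff: float, deviation_payoff: float,
                   punishment_payoff: float, delta: float) -> bool:
    """Check a stationary path against a one-shot deviation.

    Values are average discounted payoffs, so a deviator receives
    (1 - delta) * deviation_payoff today and the punishment value afterwards.
    """
    return path_payoff >= (1 - delta) * deviation_payoff + delta * punishment_payoff


def critical_discount_factor(path_payoff: float, deviation_payoff: float,
                             punishment_payoff: float) -> float:
    """Discount factor at which the one-shot deviation just stops paying."""
    return (deviation_payoff - path_payoff) / (deviation_payoff - punishment_payoff)


# Hypothetical numbers: 2 on the path, 3 from the best deviation, 0 when punished.
threshold = critical_discount_factor(2.0, 3.0, 0.0)
patient = is_enforceable(2.0, 3.0, 0.0, delta=0.5)
impatient = is_enforceable(2.0, 3.0, 0.0, delta=0.2)
```

Lowering the punishment value lowers the threshold, which is why the worst available punishment enforces the largest set of paths.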

Repeated Games with Imperfect Monitoring

One  drawback  of  repeated  games  as  a  model  of  collusion  is  that  they  do  not 
explain  price  wars:   In  equilibrium,  no  firm  ever  deviates.  This  lack  motivated 
the  Green-Porter  (1984)  model  of  "Noncooperative  Collusion  under  Imperfect 
Price  Information."  The  Green-Porter  model  is  an  infinitely  repeated 
quantity-setting  game  in  which  firms  do  not  observe  the  outputs  of  their 
opponents. Instead, firms only observe the market price, $p(Q,\theta)$, which is
determined by aggregate output Q and a stochastic disturbance, $\theta$. The
$\theta$'s in the different periods are identically and independently distributed
according to a density $f(\theta)$, which is such that the set of possible prices
(the support of $p(Q,\theta)$) is independent of Q. All firms are identical, and
there is a symmetric ("Cournot") equilibrium of the constituent game in which
each firm produces output $q^c$. As with ordinary repeated games, one
equilibrium of the repeated game is for all firms to produce $q^c$ each period.
Could  the  firms  hope  to  improve  on  this  outcome  if  they  are  patient? 

Green-Porter  show  that  they  can,  by  constructing  a  family  of  "trigger-price" 
equilibria  of  the  following  form:  Play  begins  in  the  "cooperative"  phase,  with 
each  firm  producing  some  output  q*  .   Play  remains  in  the  cooperative  phase 
as long as last period's price exceeded a trigger level p*. If the price falls
below p*, firms switch to a "punishment phase" in which each firm produces
the Cournot output $q^c$. Punishment lasts for T periods, after which play returns to a
cooperative phase. For a triple (q*, p*, T) to generate a Nash equilibrium,
each firm must prefer not to cheat in either phase. Since $q^c$ is a static
equilibrium, no firm will cheat in the punishment phases, so we need only check
the cooperative phase. Setting $q^* = q^c$ results in a trivial trigger-price
equilibrium. If the firms are somewhat patient they can do better by setting
$q^* < q^c$. In such an equilibrium, p* must be high enough that punishment
occurs with positive probability. Otherwise, a firm could increase its output
slightly  in  the  cooperative  phase  without  penalty.  Thus  punishment  will  occur 
even  if  no  firm  has  deviated.   On  seeing  a  low  price,  all  firms  expand  their 
output  not  out  of  concern  that  an  opponent  has  cheated,  but  rather  in  the 
knowledge  that  if  low  prices  did  not  sometimes  trigger  punishment,  then  their 
collusive  scheme  would  not  be  self-enforcing.   (See  Rotemberg-Saloner  (1986) 
for  a  repeated  game  model  with  perfect  monitoring  in  which  price  wars  are 
voluntary.)
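
A rough sketch of how the value of a trigger-price scheme can be computed is given below. It assumes reduced-form inputs that are not in the original model description: `pi_coop` and `pi_punish` are hypothetical expected per-period profits in the cooperative and punishment phases, and `alpha` is the probability that the price falls below the trigger when all firms produce the cooperative output.

```python
import math


def cooperative_value(pi_coop, pi_punish, alpha, delta, T):
    """Expected discounted profit at the start of a cooperative period.

    Solves V = pi_coop + (1 - alpha) * delta * V
             + alpha * (sum_{s=1..T} delta**s * pi_punish + delta**(T + 1) * V).
    """
    punish_stream = pi_punish * delta * (1 - delta**T) / (1 - delta)
    return (pi_coop + alpha * punish_stream) / (
        1 - (1 - alpha) * delta - alpha * delta ** (T + 1)
    )


# Hypothetical numbers, chosen only for illustration.
v = cooperative_value(pi_coop=1.2, pi_punish=1.0, alpha=0.1, delta=0.9, T=5)

# Right-hand side of the recursion that the closed form is meant to solve.
rhs = 1.2 + 0.9 * 0.9 * v + 0.1 * (
    sum(0.9**s * 1.0 for s in range(1, 6)) + 0.9**6 * v
)
```

When `pi_coop` equals `pi_punish` the formula collapses to the value of producing the static equilibrium output forever, which corresponds to the trivial trigger-price equilibrium.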

The  trigger-price  equilibria  constructed  by  Green-Porter  have  an  appealing 
simplicity, but they need not be optimal -- other equilibria may yield higher
expected payoffs (for the firms). Abreu-Pearce-Stacchetti (1986) investigated
the  structure  of  the  optimal  symmetric  equilibria  in  the  Green-Porter  model. 
In  the  process,  they  develop  a  tool  which  is  useful  for  analyzing  all  repeated 
games  with  imperfect  monitoring.  This  tool,  which  they  call  "self-generation," 
is  extended  in  their  1986  paper. 

Self-generation  is  a  sufficient  condition  for  a  set  of  payoffs  to  be 
supportable  by  equilibria.   It  is  the  multi-player  generalization  of  dynamic 
programming's principle of optimality, which provides a sufficient condition
for a set of payoffs, one for each state, to be the maximal net present values
obtainable in the corresponding states. Abreu-Pearce-Stacchetti's insight is
that the "states" need not directly influence the players' payoffs, but can
instead  reflect  (in  the  usual  self-confirming  way)  changes  in  the  play  of 
opponents.   Imagine  for  example  that,  in  the  Green-Porter  model,  there  are  only 
three possible values of the market price -- $p_1 > p_2 > p_3$. Price $p_i$ occurs
with probability $m_i(Q)$, where Q is total industry output. Note that past
prices  do  not  directly  influence  current  payoffs  or  transition  probabilities. 
Nevertheless, we can construct equilibrium strategies that use the realized
prices  to  determine  the  transitions  between  "fictitious"  states. 

For  example,  imagine  that  we  are  told  that  there  are  two  fictitious  states 
a and b, with associated payoffs for both firms of $u_a$ and $u_b$. (We will
look at symmetric equilibria; otherwise we would need to specify each firm's
payoffs.) We are also given the following transition rule: the state switches
from a to b if $p_3$ occurs, remaining at a if $p = p_1$ or $p_2$. State
b  is  absorbing:   once  it  is  reached,  it  prevails  from  then  on.   As  we  will  see, 
state  b  corresponds  to  an  infinite  "punishment  phase"  in  Green-Porter.   The 
values $(u_a, u_b)$ are self-generating if, in each state i = a, b, when players believe
that their future payoffs are given by $(u_a, u_b)$, there is an equilibrium $s_i$ in
current actions with average (over current and future payoffs) payoff $u_i$.
In the language of dynamic programming, this says that for each player the
payoff $u_i$ is unimprovable, given the specified continuation payoffs and his
opponents'  current  actions.   To  show  that  self-generating  payoffs  are 
sustainable  by  Nash  equilibria,  we  first  must  define  strategies  for  the  players. 
To do this, trace out the succession of single-period equilibria: for example, if play
begins in state a, and $p_1$ occurs in the first period, the state is still
a, so the second-period outputs are again given by $s_a$. By construction,
no player can gain by deviating from strategy $s_i$ in state i for one period
and then reverting to it thereafter. The standard dynamic programming
argument then shows that unimprovability implies optimality: By induction, no
player can improve on $u_a$ or $u_b$ by any finite sequence of deviations, and the
payoff to an infinite sequence of deviations can be approximated by finitely
many of them. In our example, since state b is absorbing, for $(u_a, u_b)$ to
be self-generating, $u_b$ must be self-generating as a singleton set. This means
that $u_b$ must be the payoffs in a static equilibrium, as in Green-Porter's
punishment phase. In state a, today's outcome influences the future state,
so  that  players  have  to  trade  off  their  short  run  incentive  to  deviate  against 
the  risk  of  switching  to  state  b  .  Thus  state  a  corresponds  to  a  "cooperative" 
phase,  where  players  restrict  output  to  decrease  the  probability  of  switching 
to  the  punishment  state. 
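
To make the two-state example concrete, the value in state a can be written out as a worked equation. The notation is introduced here for illustration and is not in the original: $g(s_a)$ is the expected current-period payoff when all firms follow $s_a$, $\delta$ is the discount factor, $Q_a$ is the resulting industry output, and $m(Q_a)$ is the probability that the realized price moves the state from a to b. Payoffs are normalized as averages, as in the text.

```latex
u_a = (1-\delta)\, g(s_a) + \delta \bigl[ (1 - m(Q_a))\, u_a + m(Q_a)\, u_b \bigr]
\quad\Longrightarrow\quad
u_a = \frac{(1-\delta)\, g(s_a) + \delta\, m(Q_a)\, u_b}{1 - \delta\,\bigl(1 - m(Q_a)\bigr)}
```

A firm that expands output raises its current payoff but also changes $m$, so the unimprovability condition in state a compares the first effect with the expected loss $u_a - u_b$ from reaching the absorbing state sooner.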

The  self-generation  criterion  not  only  provides  a  way  of  testing  for 
equilibria,  it  also  suggests  a  way  of  constructing  them:   one  can  construct 
state  spaces  and  transition  rules  instead  of  working  directly  with  the  strategy 
spaces.   Fudenberg-Maskin  (1986b)  use  this  technique  to  investigate  when  "folk 
theorems"  obtain  for  repeated  games  with  imperfect  monitoring. 

Returning  to  the  topic  of  implicit  collusion  in  oligopolies,  what  lessons 
do  we  learn  from  the  study  of  repeated  games?  First,  repetition  matters  more, 
and  (privately)  efficient  outcomes  are  more  likely  to  be  equilibria,  when  the 
periods  are  short.   Second,  more  precise  information  makes  collusion  easier  to 
sustain,  and  lowers  the  costs  of  the  occasional  "punishments"  which  must  occur 
to  sustain  it.   Third,  firms  will  prefer  "bright-line"  rules  which  make 
"cheating"  easy  to  identify.   For  example,  firms  would  like  to  be  able  to 
respond to changes in market conditions without triggering "punishment."
Scherer  (1980)  suggests  that  the  institutions  of  price  leadership  and  mark-up 
pricing  may  be  responses  to  this  problem.   (See  also  Rotemberg-Saloner  (1985), 
who  explain  how  price  leadership  can  be  a  collusive  equilibrium  with 
asymmetrically-informed  firms.) 

While  most  applications  of  repeated  games  have  been  concerned  with  games 
with  infinitely  lived  players,  "implicitly  collusive  equilibria"  can  arise  even 
if  all  the  players  have  finite  lives,  as  long  as  the  model  itself  has  an  infinite 
horizon.   Let  us  give  two  examples.   First,  a  finitely  lived  manager  of  a  firm 
becomes the equivalent of an infinitely lived player if he owns the firm,
because the latter's value depends on the infinite stream of profits (as in
Kreps (1985)). Second, overlapping generations of finitely lived players can
yield  some  cooperation  between  the  players.   A  player  who  cheats  early  in  his 
life  will  be  punished  by  the  next  generation,  which  in  turn  will  be  punished 
by  the  following  generation  if  it  does  not  punish  the  first  player,  etc.  (Cremer 
(1983)). 

We  conclude  this  section  with  three  warnings  on  the  limitations  of  the 
repeated  game  model.   First,  by  focusing  on  stationary  environments,  the  model 
sidesteps  the  questions  of  entry  and  entry  deterrence.   These  questions  can  in 
principle  be  studied  in  games  whose  only  time-varying  aspect  is  the  number  of 
entrants,  but  serious  treatments  of  the  entry  process  more  naturally  allow  for 
factors  such  as  investment.   Second,  because  repetition  enlarges  the  set  of 
equilibria,  selecting  an  equilibrium  becomes  difficult.  If  firms  are  identical, 
an  equal  division  of  the  monopoly  profits  seems  an  obvious  solution;  however 
if  one  complicates  the  model  by,  for  instance,  introducing  a  prior  choice  of 
investment,  most  subgames  are  asymmetric,  and  the  quest  for  a  focal  equilibrium 
becomes harder. Moreover, the selection criterion of picking a date-zero Pareto
optimal  equilibrium  outcome  is  not  "meta-perfect":   date-zero  Pareto  optimal 
outcomes  are  typically  enforced  by  the  threat  of  switching  to  a  non-Pareto 
optimal  outcome  if  some  player  deviates.   Just  after  the  deviation,  the  game 
is  formally  identical  to  the  period-zero  game,  yet  it  is  assumed  that  players 
will  not  again  coordinate  on  the  focal  Pareto-optimal  outcome.  Third,  implicit 
collusion  may  not  be  enforceable  if  the  game  is  repeated  only  finitely  many 
times. What then should we expect to occur in finite-lived markets?

Finite-Horizon  Games 

Infinite-horizon repeated games are used as an idealization of repeated
play in long-lived markets. Since actual markets are finite-lived, one should
ask whether the infinite-horizon idealization is sensible. One response is that
we can incorporate a constant probability $\gamma$ of continuing to the next period
directly into the utility functions: the expected present value of ten utils
tomorrow, if tomorrow's utils are discounted by $\delta$, and tomorrow arrives with
probability $\gamma$, is simply $10\delta\gamma$.
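
The arithmetic of folding a continuation probability into the discount factor can be sketched as follows. The function names are hypothetical, and the continuation probability is written `gamma` here purely as a labelling assumption.

```python
def expected_present_value(utils, delta, gamma, periods_ahead=1):
    """Expected present value of `utils` received `periods_ahead` periods from
    now, when each period is discounted by delta and reached with probability
    gamma."""
    return utils * (delta * gamma) ** periods_ahead


def survival_probability(gamma, n):
    """Probability that the game is still being played n periods from now."""
    return gamma ** n


value_tomorrow = expected_present_value(10, delta=0.5, gamma=0.5)
still_alive = survival_probability(0.5, 10)
```

The product of the two parameters acts as a single effective discount factor, and the survival probability is positive for every finite n even though it shrinks geometrically.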

Then if both $\delta$ and $\gamma$ are near to one the folk theorem applies. This
specification  implies  that  the  game  ends  in  finite  time  with  probability  one, 
but  there  is  still  a  positive  probability  that  the  game  exceeds  any  fixed  finite 
length.   Thus  one  may  ask  what  the  theory  predicts  if  the  game  is  certain  to 
end  by  some  very  far-distant  date.   It  is  well  known  that  in  some  games  the 
switch  from  an  infinite  horizon  to  a  long  finite  one  yields  dramatically 
different conclusions -- the set of equilibrium payoffs can expand
discontinuously at the infinite-horizon limit. This is true for example in the
celebrated  game  of  the  "prisoner's  dilemma,"  which  is  depicted  in  Figure  8. 
When  played  only  once,  the  game  has  a  unique  Nash  equilibrium,  as  it  is  a 
dominant  strategy  for  each  player  to  "fink."   "Never  fink"  is  a  perfect 
equilibrium  outcome  in  the  infinitely-repeated  game  if  players  are  sufficiently 
patient. 

With  a  finite  horizon,  cooperation  is  ruled  out  by  an  iterated  dominance 
argument:  Finking  in  the  last  period  dominates  cooperating  there;  iterating 
once,  both  players  fink  in  the  second  period,  etc.  The  infinite-horizon  game 
lacks  a  last  period  and  so  the  dominance  argument  cannot  get  started.  Should 
we  then  reject  the  cooperative  equilibria  as  technical  artifacts,  and  conclude 
that  the  "reasonable  solution"  of  the  finitely-repeated  prisoner's  dilemma  is 
"always fink"? Considerable experimental evidence shows that subjects do tend
to cooperate in many if not most periods. Thus, rather than reject the
cooperative  equilibria,  we  should  change  the  model  to  provide  an  explanation 
of  cooperation.  Perhaps  players  derive  an  extra  satisfaction  from  "cooperating" 
beyond  the  rewards  specified  by  the  experimenters.  While  this  explanation  does 
not  seem  implausible,  it  seems  a  bit  too  convenient.   Other  explanations  do  not 
add  a  payoff  for  cooperation  per  se,  but  instead  change  the  model  to  break  the 
backwards-induction argument, which is argued to be unreasonable. One way of
doing this is developed in the "reputation effects" models of Kreps, Milgrom,
Roberts,  and  Wilson,  which  we  discuss  in  Section  3.   These  models  assume,  not 
that  all  players  prefer  cooperation,  but  that  each  player  attaches  a  very  small 
prior  probability  to  the  event  that  his  opponent  does. 

Radner  (1980)  provides  another  way  of  derailing  the  backwards  induction 
in  the  finitely-repeated  game.   He  observes  that  the  best  response  against  an 
opponent  who  will  not  fink  until  you  do,  but  will  fink  thereafter  (the  "grim" 
strategy)  is  to  cooperate  until  the  last  period,  and  then  fink.   Moreover,  as 
the  horizon  T  grows,  the  average  gain  (the  gain  divided  by  T  )  to  playing 
this  way  instead  of  always  cooperating  goes  to  zero.   Formally,  in  an 
$\epsilon$-equilibrium, each player's strategy gives him within $\epsilon$ of his best attainable
payoff (over the whole horizon); in a subgame-perfect $\epsilon$-equilibrium this is true
in every subgame. Radner shows that cooperation is the outcome of a perfect
$\epsilon$-equilibrium for any $\epsilon > 0$ if players maximize their average payoff and the
horizon is sufficiently long. Radner's result relies on "rescaling" the
players' utility functions by dividing by the length of the game. Thus
one-period gains become relatively unimportant (compared to the fixed $\epsilon$) as
the  horizon  grows. 
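
Radner's rescaling argument can be illustrated with a small computation. The payoff numbers below are hypothetical (the entries of Figure 8 are not reproduced here), and the sketch only considers replies of the form "cooperate until some period, then fink for the rest of the game," which is without loss against a grim opponent because he finks forever once finked upon.

```python
def payoff_against_grim(first_fink, T, c, d, p):
    """Total payoff over T periods against a grim opponent.

    first_fink is the period (1..T) of the player's first fink, or None for
    always cooperating. After a fink both players fink for the rest of the game.
    c: mutual cooperation, d: finking on a cooperator, p: mutual finking.
    """
    if first_fink is None:
        return T * c
    return (first_fink - 1) * c + d + (T - first_fink) * p


def average_gain_from_best_response(T, c, d, p):
    """Per-period gain of the best reply to grim over always cooperating."""
    best = max(payoff_against_grim(k, T, c, d, p) for k in range(1, T + 1))
    return (best - payoff_against_grim(None, T, c, d, p)) / T


# Hypothetical prisoner's dilemma payoffs with d > c > p.
c, d, p = 2, 3, 1
best_period = max(range(1, 11), key=lambda k: payoff_against_grim(k, 10, c, d, p))
gains = [average_gain_from_best_response(T, c, d, p) for T in (10, 100, 1000)]
```

In this example the best reply finks only in the last period, and the average gain from doing so is the one-period gain d - c divided by T.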

Fudenberg-Levine (1983) show that if players discount the future then the
$\epsilon$-equilibrium, finite horizon approach gives "exactly" the same conclusions as
the infinite-horizon one: the set of infinite-horizon (perfect) equilibria
coincides with the set of limit points of finite horizon (perfect)
$\epsilon$-equilibria, where $\epsilon$ goes to zero as the horizon T goes to infinity. That
is, every such limit point is an infinite-horizon equilibrium, and every
infinite-horizon equilibrium can be approximated by a convergent sequence of
finite horizon $\epsilon$-equilibria. Fudenberg-Levine defined the "limits" in the
above with respect to a topology that requires the action played to be uniformly
close in every subgame. In finite-action games (games with a finite number of
actions per period) this reduces to the condition that $s^n \to s$ if
$s^n$ and $s$ exactly agree in the first $k_n$ periods for all initial histories,
where $k_n \to \infty$ as $n \to \infty$. Harris (1985a) shows that this simpler convergence
condition can be used in most games, and dispenses with a superfluous
requirement that payoffs be continuous.

With either the Fudenberg-Levine or the Harris topology, the strategy spaces
are compact in finite-action games, so that the limit result can be restated
as follows: Let $\Gamma(\epsilon, T)$ be the correspondence yielding the set of
$\epsilon$-equilibria of the T-period game. Then $\Gamma$ is continuous at $(0, \infty)$. This
continuity allows one to characterize infinite-horizon equilibria by working
with finite-horizon ones. Backwards-induction can be applied to the latter,
albeit tediously, but not to the former, so that working with the finite horizon
$\epsilon$-equilibria is more straightforward. The continuity result holds for
discounted repeated games, and for any other game in which players are not too
concerned about actions to be taken in the far-distant future. (It does not
hold in general for the time-average payoffs considered by Radner.)
Specifically,  preferences  over  outcome  paths  need  not  be  additively  separable 
over  time,  and  there  can  be  links  between  past  play  and  future  opportunities. 
In  particular  the  result  covers  the  non-repeated  games  discussed  later  in  this 
section. The intuition is simply that if players are not too concerned about
the future, the equilibria of the infinite-horizon game should be similar to
the equilibria of the "truncated" game in which no choices are allowed after
some terminal time T. So for any equilibrium s of the infinite horizon game
and $\epsilon > 0$, by taking T long enough, the difference in each player's payoff
between the play prescribed by s and that obtained by truncating s at time
T will be of order $\epsilon$.
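
For the discounted repeated-game case, the truncation intuition can be turned into a simple calculation. The sketch below assumes average discounted payoffs and per-period payoffs confined to an interval of known width; the function name and the parameter `payoff_span` are illustrative assumptions.

```python
def truncation_horizon(delta, payoff_span, eps):
    """Smallest T with delta**T * payoff_span <= eps.

    With average discounted payoffs and per-period payoffs confined to an
    interval of width payoff_span, play after period T can change a player's
    payoff by at most delta**T * payoff_span.
    """
    if not (0 < delta < 1) or eps <= 0:
        raise ValueError("need 0 < delta < 1 and eps > 0")
    T = 0
    while delta**T * payoff_span > eps:
        T += 1
    return T


horizon = truncation_horizon(delta=0.5, payoff_span=1.0, eps=0.125)
long_horizon = truncation_horizon(delta=0.9, payoff_span=4.0, eps=0.01)
```

The required horizon grows as the discount factor approaches one, which is consistent with the remark that the result need not hold for time-average payoffs.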

We  should  point  out  that  the  "epsilons"  are  not  always  needed  to  ensure 
continuity  at  the  infinite-horizon  limit.   One  example  is  Rubinstein's  (1982) 
bargaining  game,  which  even  with  an  infinite  horizon  has  a  unique  perfect 
equilibrium.   (Rubinstein  allows  players  to  choose  from  a  continuum  of  sharing 
rules  between  0  and  1  .  With  a  finite  grid  of  shares,  the  uniqueness  result 
requires that each player prefers the second-largest partition today to the
largest  one  tomorrow,  so  that  the  grid  must  be  very  fine  if  the  discount  factors 
are  near  to  one.)   Benoit-Krishna  (1985)  provide  conditions  for  continuity  to 
obtain  in  the  "opposite"  way,  with  the  set  of  finite-horizon  equilibria 
expanding  as  the  horizon  grows,  and  approaching  the  limit  set  given  by  the  folk 
theorem.   (Friedman  (1984)  and  Fraysse-Moreaux  (1985)  give  independent  but  less 
complete  analyses.)   For  Nash  equilibria  this  is  true  as  long  as  the  static 
equilibria  give  all  players  more  than  their  minmax  values.   Then  any 
individually-rational  payoffs  can  be  enforced  in  all  periods  sufficiently 
distant  from  the  terminal  date  by  the  threat  that  any  deviations  result  in  the 
deviator  being  minmaxed  for  the  rest  of  the  game.   Such  threats  are  not 
generally  credible,  so  proving  the  analogous  result  for  perfect  equilibria  is 
more  difficult.   Benoit-Krishna  show  that  the  result  does  hold  for  perfect 
equilibria  if  each  player  has  a  strict  preference  for  one  static  equilibrium 
as  opposed  to  another  (in  particular  there  must  be  at  least  two  static 
equilibria) and the Fudenberg-Maskin full dimensionality condition is
satisfied.  The  construction  that  Benoit-Krishna  use  to  prove  this  is  too 
intricate  to  explain  here,  but  it  is  easy  to  see  that  there  can  be  perfect 
equilibria  of  a  finitely-repeated  game  which  are  not  simply  a  succession  of 
static  equilibria.   Consider  the  game  in  Figure  9. 

There  are  two  pure-strategy  static  equilibria,  (U,L)  and  (M,M).   In  the 
twice-repeated  game  (without  discounting,  for  simplicity)  there  is  an 
equilibrium  with  total  payoffs  (-1,-1).   These  payoffs  result  from  the 
strategies "play (D,R) in the first period; play (U,L) in the second iff (D,R)
was  played  in  the  first,  otherwise,  play  (M,M)." 

2B.   Continuous-time  Games 

Frequently, continuous-time models seem simpler and more natural than models
with a fixed, non-negligible period length. For example, differential equations
can be easier to work with than difference equations. As in games with a
continuum of actions, continuous-time games may fail to have equilibria in the
absence of continuity conditions. More troublesome, there are deep mathematical
problems in formulating general continuous-time games.

As Anderson (1985) observes, "general" continuous-time strategies need not
lead  to  a  well-defined  outcome  path  for  the  game,  even  if  the  strategies  and 
the  outcome  path  are  restricted  to  be  continuous  functions  of  time.   He  offers 
the  example  of  a  two-player  game  where  players  simultaneously  choose  actions 
on the unit interval. Consider the continuous-time strategy "play at each time
t the limit as $\tau \to t$ of what the opponent has played at times $\tau$ previous
to t." This limit is the natural analog of the discrete-time strategy "match
the opponent's last action." If at all times before t the players have chosen
matching actions, and the history is continuous, there is no problem in
computing what should be played at t. However, there is not a unique way of
extending  the  outcome  path  beyond  time  t  .   Knowing  play  before  t  determines 
the  outcome  at  t  ,  but  is  not  sufficient  to  extend  the  outcome  path  to  any  open 
interval  beyond   t  .   As  a  result  of  this  problem,  Anderson  opts  to  study  the 
limits  of  discrete-time  equilibria  instead  of  working  with  continuous  time. 

Continuous  time  formulations  are  fairly  tractable  when  strategies  depend 
on  a  "small"  set  of  histories.  This  is  the  case  in  stopping-time  games, 
open-loop games, and in situations where players use "state-space" strategies.
These  games  or  strategies  are  not  restricted  to  continuous  time,  and 
discrete-time  versions  of  all  of  them  have  been  used  in  the  industrial 
organization  literature. 

2C.    State-Space  or  Markov  Equilibria 

Consider  games  in  which  players  maximize  the  present  value  of  instantaneous 
flow  payoffs,  which  may  depend  on  state  variables  as  well  as  current  actions. 
(The  feasible  actions  may  also  depend  on  the  state.)   For  example,  current 
actions  could  be  investment  decisions,  and  the  state  could  be  the  stocks  of 
machinery.   Or  current  actions  could  be  expenditures  on  R&D,  with  the  state 
variables  representing  accumulated  knowledge.   The  strategy  spaces  are 
simplified  by  restricting  attention  to  "state-space"  (or  "Markov")  strategies 
that  depend  not  on  the  complete  specification  of  past  play,  but  only  on  the 
state  (and,  perhaps,  on  calendar  time.)   A  state-space  or  Markov  equilibrium 
is  an  equilibrium  in  state-space  strategies,  and  a  perfect  state-space 
equilibrium  must  yield  a  state-space  equilibrium  for  every  initial  state.  Since 
the  past's  influence  on  current  and  future  payoffs  and  opportunities  is 
summarized  in  the  state,  if  one's  opponents  use  state-space  strategies,  one 
could  not  gain  by  conditioning  one's  play  on  other  aspects  of  the  history.   Thus 
a state-space equilibrium is an equilibrium in a game with less restricted
strategies.   The  state-space  restriction  can  however  rule  out  equilibria,  as 
shown  by  the  infinitely-repeated  prisoner's  dilemma.   Since  past  play  has  no 
effect  on  current  payoffs  or  opportunities,  the  state-space  is  null,  and  all 
state-space  strategies  must  be  constants.  Thus  the  only  state-space  equilibrium 
is  for  both  sides  to  always  fink.   (Caution:   this  conclusion  may  be  due  to  a 
poor  model,  and  not  to  the  wrong  equilibrium  concept.  Section  4E  shows  how  the 
conclusion  is  reversed  in  a  slightly  different  model.) 
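
The claim about the repeated prisoner's dilemma can be checked mechanically. With a null state, each state-space strategy is a constant action, and against a constant opponent a player's best reply is his stage-game best reply in every period, so it suffices to look for pairs of constant actions that are mutual stage-game best replies. The payoff numbers and names below are hypothetical.

```python
from itertools import product

# Hypothetical prisoner's dilemma payoffs: (row player, column player).
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "F"): (0, 3),
    ("F", "C"): (3, 0),
    ("F", "F"): (1, 1),
}
ACTIONS = ("C", "F")


def constant_strategy_equilibria(payoffs, actions):
    """Profiles of constant actions from which no player gains by switching
    to another action (i.e. pure equilibria of the stage game)."""
    equilibria = []
    for a1, a2 in product(actions, repeat=2):
        u1, u2 = payoffs[(a1, a2)]
        best1 = all(u1 >= payoffs[(b, a2)][0] for b in actions)
        best2 = all(u2 >= payoffs[(a1, b)][1] for b in actions)
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


markov_equilibria = constant_strategy_equilibria(PAYOFFS, ACTIONS)
```

Only mutual finking survives, whatever the discount factor, because the constant strategies leave no room for history-dependent punishments.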

Maskin-Tirole  (1985),  using  the  Markov  restriction,  obtain  collusion  in  a 
repeated  price  game  in  which  prices  are  locked  in  for  two  periods.   They  argue 
that  what  is  meant  by  "reaction"  is  often  an  attempt  by  firms  to  react  to  a  state 
that  affects  their  current  profits;  for  instance,  when  facing  a  low  price  by 
their  opponents,  they  may  want  to  regain  market  share.   In  the  classic  repeated 
game  model,  firms  move  simultaneously,  and  there  is  no  physical  state  to  react 
to.   If,  however,  one  allows  firms  to  alternate  moves,  they  can  react  to  their 
opponent's  price.   (Maskin-Tirole  derive  asynchronicity  as  the  (equilibrium) 
result  of  the  two-period  commitments.)   The  possibility  of  reaction  leads  to 
interesting  Markov  equilibria.   However,  although  equilibrium  payoffs  are 
bounded  away  from  the  competitive  levels  (in  contrast  to  the  folk  theorem 
approach), there are still many equilibria. (Maskin-Tirole use
renegotiation-proofness to select one which exhibits the classic "kinked demand
curve.") Gertner (1985b) formalizes collusion with Markov strategies when
commitment  (inertia)  takes  the  form  of  a  fixed  cost  of  changing  prices. 

The  literal  definition  of  a  state  says  that  strategies  can  depend  "a  lot" 
on  variables  with  very  little  influence  on  payoffs,  but  they  cannot  depend  at 
all on variables that have no influence. This can generate rather silly
discontinuities. For example, we can restore the cooperative equilibria in the
repeated prisoner's dilemma by adding variables that keep track of the number
of  times  each  player  has  finked.   If  these  variables  have  an  infinitesimal 
effect  on  the  flow  payoffs,  the  cooperative  equilibria  can  be  restored. 

The state-space restriction does not always rule out "supergame"-type
equilibria,  as  shown  in  Fudenberg-Tirole  (1983a).   They  reconsidered  a  model 
of  continuous-time  investment  that  had  been  introduced  by  Spence  (1979).   Firms 
choose  rates  of  investment  in  productive  capacity.   The  cost  of  investment  is 
linear  in  the  rate  up  to  some  upper  bound,  with  units  chosen  so  that  one  unit 
of  capital  costs  one  dollar.   If  firms  did  not  observe  the  investment  of  their 
rivals,  each  firm  would  invest  up  to  the  point  where  its  marginal  productivity 
of  capital  equalled  the  interest  rate.   The  capital  levels  at  this  "Cournot" 
point  exceed  the  levels  the  firms  would  choose  if  they  were  acting  collusively, 
because  each  firm  has  ignored  the  fact  that  its  investment  lowers  its  rivals' 
payoffs.   Now,  if  firms  observe  their  rivals'  investment  (in  either  discrete 
or  continuous  time)  they  could  play  the  strategy  of  stopping  investment  once 
the  collusive  levels  are  reached.   This  "early  stopping"  is  enforced  by  the 
(credible)  threat  that  if  any  firm  invests  past  the  collusive  level,  all  firms 
will  continue  to  invest  up  to  the  "Cournot"  levels.  The  state-space  restriction 
seems  to  have  little  force  in  this  game.   There  are  no  general  results  on  when 
the  restriction  is  likely  to  have  a  significant  impact. 

State-space  games  closely  resemble  control  problems,  so  it  is  not  surprising 
that  they  have  been  studied  by  control  theorists.   Indeed,  the  idea  of 
perfection  is  just  the  many-player  version  of  dynamic  programming,  and  it  was 
independently  formulated  by  Starr-Ho  (1967)  in  the  context  of  nonzero-sum 
differential  games.   The  differential  games  literature  restricts  attention  to 
state-space  equilibria  in  which  the  equilibrium  payoffs  are   continuous  and 
almost-everywhere differentiable functions of the state. These conditions
obtain naturally for control problems in smooth environments, but they impose
significant restrictions in games: It might be that each player's strategy,
and thus each player's payoff, change discontinuously with the state due to the
self-fulfilling  expectation  that  the  other  players  use  discontinuous 
strategies.   This  was  the  case  in  the  "early-stopping"  equilibria  of  the  last 
paragraph,  so  those  equilibria  would  not  be  admissible  in  the  differential  games 
setting. Perhaps the continuity restriction can be justified by the claim that
the "endogenous discontinuities" that it prohibits require excessive
coordination,  or  are  not  robust  to  the  addition  of  a  small  amount  of  noise  in 
the  players'  observations.   We  are  unaware  of  formal  arguments  along  these 
lines. 

The  technical  advantage  of  restricting  attention  to  smooth  equilibria  is 
that  necessary  conditions  can  then  be  derived  using  the  variational  methods  of 
optimal control theory. Assume that player i wishes to choose $a_i$ to maximize
the integral of his flow payoff $\pi_i$, subject to the state evolution equation

(2)      $\dot{k}(t) = f(k(t))$,  $k(0) = k_0$.

Introducing costate variables $\lambda_i$, we define $H_i$, the Hamiltonian for player
i, as

(3)      $H_i = \pi_i(k,a,t) + \lambda_i f(k(t))$.

A state-space equilibrium a(t) must satisfy

(4)      $a_i = a_i(k,t)$ maximizes $H_i(k,t,a,\lambda_i)$

and

(5)      $\dot{\lambda}_i = -\partial H_i/\partial k_i - \sum_{j \neq i} (\partial H_i/\partial a_j)(\partial a_j/\partial k_i)$,

along with the appropriate transversality condition.

Notice  that  for  a  one-player  game  the  second  term  in  (5)  vanishes,  and  the 
conditions  reduce  to  the  familiar  ones.   In  the  n-player  case,  this  second  term 
captures  the  fact  that  player  i  cares  about  how  his  opponents  will  react  to 
changes  in  the  state.   Because  of  the  cross-influence  term,  the  evolution  of 
the costate $\lambda_i$ is determined by a system of partial differential equations, instead of by
ordinary  differential  equations  as  in  the  one-player  case.   As  a  result,  very 
few  differential  games  can  be  solved  in  closed  form.   An  exception  is  the 
linear-quadratic  case,  which  has  been  studied  by  Starr-Ho  among  others.   Hanig 
(1985)  and  Reynolds  (1985)  consider  a  linear-quadratic  version  of  the 
continuous-time investment game. (Their model is that of
Spence-Fudenberg-Tirole, except that the cost of investment increases
quadratically in the rate.) They show that the "smooth" equilibrium for the
game has higher steady-state capital stocks, and so lower profits, than the
static  "Cournot"  levels.   Is  this  a  better  prediction  than  the  collusive  levels? 
We  do  not  know. 

Judd  (1985)  offers  an  alternative  to  the  strong  functional  form  assumptions 
typically invoked to obtain closed-form solutions to differential games. His
method  is  to  analyze  the  game  in  the  neighborhood  of  a  parameter  value  that 
leads  to  a  unique  and  easily  computed  equilibrium.   In  his  examples  of  patent 
races,  he  looks  at  patents  with  almost  zero  value.  Obviously  if  the  patent  has 
exactly zero value, in the unique equilibrium players do no R&D and have zero
values. Judd proceeds to expand the system about this point, neglecting all
terms over third order in the value of the patent. Judd's method gives only
local  results,  but  it  solves  an  "open  set"  in  the  space  of  games,  as  opposed 
to  conventional  techniques  that  can  be  thought  of  as  solving  a  lower-dimensional 
subset  of  them.   We  encourage  the  interested  reader  to  consult  Judd's  paper 
for  the  technical  details. 

2D.   Games  of  Timing 

In  a  game  of  timing,  each  player's  only  choice  is  when  and  whether  to  take 
a single pre-specified action. Few situations can be exactly described this
way,  because  players  typically  have  a  wider  range  of  choices.   For  example, 
firms  typically  do  not  simply  choose  a  time  to  enter  a  market,  but  also  decide 
on  the  scale  of  entry,  the  type  of  product  to  produce,  etc.   This  detail  can 
prove  unmanageable,  which  is  why  industrial  organization  economists  have 
frequently  abstracted  it  away  to  focus  on  the  timing  question  in  isolation. 

We  will  not  even  try  to  discuss  all  games  of  timing,  but  only  two-player 
games  which  "end"  once  at  least  one  player  has  moved.   Payoffs  in  such  games 
can be completely described by six functions $L_i(t)$, $F_i(t)$, and $B_i(t)$,
$i = 1, 2$. Here $L_i$ is player i's payoff if player i is the first to move
(the "leader"), $F_i$ is i's payoff if j is the first to move (the
"follower"), and $B_i$ is i's payoff if both players move simultaneously. This
framework is slightly less restrictive than it appears, in that it can
incorporate games which continue until both players have moved. In such games,
once one player has moved, the other one faces a simple maximization problem,
which can be solved and "folded back" to yield the payoffs as a function of the
time of the first move alone. A classic example of such a game is the "war of
attrition," first analyzed by Maynard Smith (1974): Two animals are fighting
for a prize of value v; fighting costs one util per unit time. Once one animal
quits, his opponent wins the prize. Here $L(t)$ is $-t$, and $F(t)$ is $v - t$.
With short time periods $B(t)$ will turn out to not matter much; let's set it
equal to $v/q - t$, $q \geq 2$. If $q = 2$ then each player has probability 1/2 of
winning the prize if both quit at once; if $q = \infty$ this probability is zero. Let
us solve for a symmetric equilibrium of the discrete-time version with period
length $\Delta$. Let p be the probability that each player moves at t when
both are still fighting. For players to use stationary mixed strategies, the
payoff to dropping out, $pv/q$, must equal that to fighting one more period and
then dropping out, $pv + (1-p)pv/q - \Delta$. Equating these terms yields


$$p = \left(1 - (1 - 4\Delta/qv)^{1/2}\right) q/2 .$$


[Dropping out with probability p per period is a "behavioral strategy"; the
corresponding mixed strategy is an exponential distribution over stopping
times.]
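
As a numerical check on the formula for p, the following Python sketch evaluates it for a few period lengths and confirms that it satisfies the indifference condition stated above. It is an illustration only: the function names and the parameter values (v = 10, q = 2 or 5) are our assumptions, not part of the original analysis.

```python
import math

def dropout_probability(delta, v, q):
    """Smaller root of p**2 - q*p + q*delta/v = 0, i.e. the formula in the text."""
    return (1.0 - math.sqrt(1.0 - 4.0 * delta / (q * v))) * q / 2.0

def indifference_gap(p, delta, v, q):
    """Payoff to dropping out now minus payoff to fighting one more period first."""
    drop_now = p * v / q
    fight_once = p * v + (1.0 - p) * p * v / q - delta
    return drop_now - fight_once

v = 10.0  # illustrative prize value
for q in (2.0, 5.0):
    for delta in (1.0, 0.1, 0.001):
        p = dropout_probability(delta, v, q)
        # The gap should be zero up to rounding; p / delta should approach 1 / v.
        print(q, delta, p, p / delta, indifference_gap(p, delta, v, q))
```

For small period lengths the ratio p/delta printed by the loop is close to 1/v for both values of q, which is the sense in which the tie-breaking rule B(t) stops mattering.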

Let us note that as $\Delta \to 0$, $p \approx \Delta/v$, independent of q. More generally, a war
of  attrition  is  a  game  of  "chicken",  in  which  each  player  prefers  his  opponent 
to  move  (F(t)>L(t)),  and  wishes  that  he  would  do  so  quickly  (F  and  L  decrease 
over  time.)   Weiss-Wilson  (1984)  characterize  the  equilibria  of  a  large  family 
of  discrete-time  wars  of  attrition;  Hendricks-Wilson  do  the  same  for  the 
continuous-time version. Section 3 describes some of the many
incomplete-information wars of attrition that have been applied to oligopoly
theory.

Preemption  games  are  the  opposite  case,  with  L(t)  >  F(t)  ,  at  least  over 
some set of times. Here the specification of $B(t)$ is more important, since if
B exceeds F we might expect both players to move simultaneously. One example
of a preemption game is the decision of when and whether to build a new plant
or  adopt  a  new  innovation,  when  the  market  is  only  big  enough  to  support  one 
such  addition.   (If  each  firm  will  eventually  build  a  plant,  but  the  second 
mover  would  optimally  choose  to  wait  until  long  after  the  first  one,  we  can 
"fold  back"  the  second  mover's  choice  to  get  the  payoffs  as  a  function  of  the 
time  of  the  first  move  alone.) 

The  relationship  between  L  and  F  can  change  over  time,  and  the  two 
players  may  have  different  "types"  of  preferences,  as  in  Katz-Shapiro  (1984). 
No  one  has  yet  attempted  a  general  classification  of  all  games  of  timing. 
Because  the  possible  actions  and  histories  are  so  limited,  it  is  easy  to 
formulate continuous-time strategies for these games, in a way that permits a
well-defined  map  from  strategies  to  outcomes.   We  develop  these  strategies 
below.   However,  we  will  see  that  the  simplicity  of  this  formulation  is  not 
without  cost,  as  it  is  not  rich  enough  to  represent  some  limits  of  discrete-time 
strategies.  That  is,  there  are  distributions  over  outcomes  (who  moves  and  when) 
that  are  the  limits  of  distributions  induced  by  discrete-time  strategies,  but 
which cannot be generated by the "obvious" continuous-time strategies.

The usual and simple continuous-time formulation is that each player's
strategy is a function $G_i(t)$ which is non-decreasing, right-continuous, and
has range in [0,1]. This formulation was developed in the 1950's for the study
of zero-sum "duels", and was used by Pitchik (1982), who provides several
existence theorems. The interpretation is that $G_i$ is a distribution function,
representing the cumulative probability that player i has moved by time t
conditional on the other player not having moved previously. These distribution
functions need not be continuous; a discontinuity at time t implies that the
player moves with non-zero probability at exactly time t. Where $G_i$ is
differentiable, its derivative $dG_i$ is the density which gives the probability
of a move over a short time interval. With this notation, player one's payoff
to the strategies $G_1$, $G_2$ is

$$V^1(G_1,G_2) = \int_0^\infty \left[L(s)(1-G_2(s))\,dG_1(s) + F(s)(1-G_1(s))\,dG_2(s)\right] + \sum_s \alpha_1(s)\alpha_2(s)B(s),$$

where $\alpha_i(s)$ is the size of the jump in $G_i$ at s.

This  formulation  is  very  convenient  for  wars  of  attrition.   In  these  games 
there  are  "nice"  discrete-time  equilibria  in  which  the  probability  of  moving 
in  each  period  is  proportional  to  the  period  length.   In  the  example  computed 
above, the equilibrium strategies converged to the continuous-time limit
$G(t) = 1 - \exp(-t/v)$. (For the case q=2, the sum of the two players' payoffs
is upper hemi-continuous, and the fact that the equilibria converge is a
consequence of Theorem 5 of Dasgupta-Maskin (1986).) More complex wars of
attrition can have continuous-time equilibria with "atoms" ($\alpha_i(t) > 0$ for some
t), but as the periods shrink these atoms become isolated, and again admit a
nice  continuous-time  representation. 
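
The convergence of the discrete-time war-of-attrition strategies to the limit G(t) = 1 - exp(-t/v) can be seen numerically. The sketch below is only an illustration under assumed values (v = 10, t = 5, q = 2); the function names are ours. It compares the probability of having dropped out within the first t/delta periods with the continuous-time limit.

```python
import math

def dropout_probability(delta, v, q=2.0):
    """Per-period dropout probability from the discrete-time formula in the text."""
    return (1.0 - math.sqrt(1.0 - 4.0 * delta / (q * v))) * q / 2.0

def discrete_cdf(t, delta, v, q=2.0):
    """Probability that a player has dropped out within the first t/delta periods."""
    periods = int(round(t / delta))
    return 1.0 - (1.0 - dropout_probability(delta, v, q)) ** periods

def limit_cdf(t, v):
    """Continuous-time limit G(t) = 1 - exp(-t/v)."""
    return 1.0 - math.exp(-t / v)

v, t = 10.0, 5.0  # illustrative values
for delta in (1.0, 0.1, 0.01, 0.0001):
    # The two columns should move closer together as delta shrinks.
    print(delta, discrete_cdf(t, delta, v), limit_cdf(t, v))
```

The gap between the two columns shrinks with the period length because the per-period probability is (to first order) proportional to delta, which is exactly the "nice" property the text attributes to wars of attrition.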

Preemption  games  are  markedly  different  in  this  respect,  as  shown  in 
Fudenberg-Tirole (1985). Consider the discrete-time "grab-the-dollar" game:
L(t)=1, F(t)=0, and B(t)=-1. The interpretation is that "moving" here is
grabbing  a  dollar  which  lies  between  the  two  players.   If  either  grabs  alone, 
he  obtains  the  dollar,  but  simultaneous  grabbing  costs  each  player  one.   There 
is  a  symmetric  equilibrium  in  which  each  player  moves  with  (conditional) 
probability  1/2  in  each  period.   Note  well  that  the  intensity  of  the 
randomization  is  independent  of  the  period  length.   The  corresponding  payoffs 
are (0,0), and the distribution over outcomes is that with identical probability
$(1/4)^t$ either player one wins (moves alone) in period t, or player two wins
in period t, or both move at once at t (numbering periods t = 1, 2, ...). As the length of the period converges
to zero, this distribution converges to one in which the game ends with
probability  one  at  the  start,  with  equal  probabilities  of  1/3  that  player  one 
wins,  that  player  two  does,  or  that  they  both  move  at  once.   This  distribution 
cannot  be  implemented  with  the  continuous-time  strategies  described  above,  for 
it would require a correlating device à la Aumann. Otherwise, at least one
player  would  move  with  probability  one  at  the  start,  which  would  make  it 
impossible  for  his  opponent  to  have  a  1/3  probability  of  winning.   The  problem 
is that a great many discrete-time strategies converge to a continuous-time
limit in which both players move with probability one at time zero, including
"move with probability 1/2 each period," and "move with probability one at the
start." The usual continuous-time strategies implicitly associate an atom of
size  one  with  an  atom  of  that  size  in  discrete  time,  and  thus  they  cannot 
represent  the  limit  of  the  discrete-time  strategies.   Fudenberg-Tirole  offered 
an expanded notion of continuous-time strategies that "works" for the grab-the-dollar
game, but they did not attempt a general treatment of what the strategy
space  would  need  to  be  to  handle  all  games  of  timing. 
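
The grab-the-dollar calculations can be reproduced exactly with rational arithmetic. The following Python sketch is an illustration, with function names of our own choosing: it checks that a player is indifferent between grabbing and waiting when the opponent grabs with probability 1/2 (taking the continuation value to be the equilibrium payoff of zero), and then accumulates the probabilities of the three outcomes over a growing number of periods.

```python
from fractions import Fraction

# Payoffs of the grab-the-dollar game as given in the text.
LEADER, FOLLOWER, BOTH = Fraction(1), Fraction(0), Fraction(-1)

def grab_payoff(p_opp):
    """Expected payoff to grabbing now when the opponent grabs with probability p_opp."""
    return p_opp * BOTH + (1 - p_opp) * LEADER

def wait_payoff(p_opp, continuation):
    """Expected payoff to waiting one period, given the value of continuing the game."""
    return p_opp * FOLLOWER + (1 - p_opp) * continuation

def outcome_probs(periods, p=Fraction(1, 2)):
    """Probabilities (one wins, two wins, both move) that the game ends within `periods` periods."""
    one = two = both = Fraction(0)
    still_playing = Fraction(1)
    for _ in range(periods):
        one += still_playing * p * (1 - p)
        two += still_playing * (1 - p) * p
        both += still_playing * p * p
        still_playing *= (1 - p) ** 2
    return one, two, both

half = Fraction(1, 2)
print(grab_payoff(half), wait_payoff(half, Fraction(0)))
for periods in (1, 2, 10, 50):
    print(periods, [float(x) for x in outcome_probs(periods)])
```

Because the mixing probability does not depend on the period length, any fixed interval of real time contains more and more periods as the period shrinks, so each of the three cumulative probabilities approaches 1/3 arbitrarily soon after the start.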

The  moral  of  this  story  is  that  while  continuous  time  is  often  a  convenient 
idealization  of  very  short  time  periods,  one  should  keep  in  mind  that  a  given 
formulation  of  continuous  time  may  not  be  adequate  for  all  possible 
applications.   When  confronted  with,  for  example,  the  non-existence  of 
equilibria in a seemingly "nice" continuous-time game, it can be useful to think
about  discrete-time  approximations.   Simon  and  Stinchcombe  (1985)  provide  a 
general  analysis  of  when  the  usual  continuous-time  strategies  are  in  fact 
appropriate. 


2E.   Discrete vs. Continuous Time, and the Role of Period Length

The discussion above stressed that one must be careful that a given
continuous  time  model  is  rich  enough  to  serve  as  the  "appropriate"  idealization 
of  very  short  time  periods.  Now  we'd  like  to  point  out  that  new  equilibria  can 
arise  in  passing  to  continuous  time,  and  that  these  should  not  be  discarded  as 
pathological.   The  simplest,  and  oldest,  example  of  this  fact  is  the 
Kreps-Wilson  (1982a)  stopping-time  version  of  the  prisoner's  dilemma.   In  this 
version,  players  begin  by  cooperating,  and  once  either  finks  they  both  must  fink 
forever  afterwards.  Thus  the  only  choice  players  have  is  when  to  fink  if  their 
opponent  has  not  yet  done  so.   In  discrete  time  with  a  finite  horizon,  the 
familiar  backwards  induction  argument  shows  that  the  only  equilibrium  is  to  both 
fink  at  once.   However,  the  gain  to  finking  one  period  ahead  of  one's  opponent 
is proportional to the period length, and in the continuous-time limit, there
is no gain to finking. Thus cooperation is an equilibrium in the
continuous-time game.

The  analogy  with  the  finite-to-infinite  horizon  limit  is  more  than 
suggestive.   In  a  generalization  of  their  earlier  work,  Fudenberg-Levine  (1986) 
showed  that,  in  cases  such  as  stopping-time  games  and  state-space  games  where 
the continuous-time formulation is not too problematic, any continuous-time
equilibrium  is  a  limit  of  discrete-time  epsilon-equilibria,  where  the  epsilon 
converges  to  zero  with  the  length  of  the  period. 

Little  is  known  in  general  about  the  effect  of  period  length  on  equilibrium 
play,  but  several  examples  have  been  intensively  studied.   The  best  known  is 
the work of Coase (1972), Bulow (1982), Stokey (1981), and
Gul-Sonnenschein-Wilson (1986), who argue with varying degrees of formality that
the monopolistic producer of a durable good loses the power to extract rents
as the time period shrinks, thus verifying the "Coase conjecture." (See also
Sobel-Takahashi (1983) and Fudenberg-Levine-Tirole (1985).)

2F.   Open-Loop Equilibria

The terms "open-loop" and "closed-loop" refer to two different information
structures for multi-stage dynamic games. In an open-loop model, players cannot
observe the play of their opponents; in a closed-loop model all past play is
common knowledge at the beginning of each stage. Like "Cournot" and "Bertrand"
equilibria, open- and closed-loop equilibria are shorthand ways of referring
to the perfect equilibria of the associated model. (Caution: This terminology
is widespread but not universal. Some authors use "closed-loop equilibrium"
to refer to all the Nash equilibria of the closed-loop model. We prefer to
ignore the imperfect equilibria.) Open- and closed-loop models embody different
assumptions about the information lags with which players observe and respond
to each other's actions, and thus about the length of time to which players can
"commit" themselves not to respond to their opponents. In an open-loop model,
these lags are infinite, while in a closed-loop model, a player can respond to
his opponents. Because dynamic interactions are limited in open-loop
equilibria,  they  are  more  tractable  than  closed-loop  ones.   For  this  reason, 
economists  have  sometimes  analyzed  the  open-loop  equilibria  of  situations  which 
seem  more  naturally  to  allow  players  to  respond  to  their  opponents.   One 
possible  justification  for  this  is  that,  if  there  are  many  "small"  players,  so 
that  no  one  player  can  greatly  affect  the  others,  then  optimal  reactions  should 
be  negligible.   When  this  is  true,  the  open-loop  equilibria  will  be  a  good 
approximation of the closed-loop ones. Fudenberg-Levine (1987) explore this
argument, and find that its validity in a T-period game requires strong
conditions on the first through T-th derivatives of payoffs.

Section 3:   Static Games of Incomplete and Imperfect Information

3A.  Bayesian Games and Bayesian Equilibrium

Players in a game are said to have incomplete information if they do
not know some of their opponents' characteristics (objective functions);
they have imperfect information if they do not observe some of their
opponents' actions. Actually, the distinction between incomplete and imperfect
information is convenient, but artificial. As Harsanyi [1967] has shown, at
a formal level, one can always transform a game of incomplete information
into a game of imperfect information. The idea is the following: let the
original game be an n-player game with incomplete information. Assume that
each player's characteristic is known by the player, but, from the point of
view of the (n-1) other players, is drawn according to some known probability
distribution. (See below for a discussion of this representation.)
Harsanyi's construction of a transformed game introduces nature as an (n+1)st
player, whose strategy consists in choosing characteristics for each of the n
original players at the start of the game, say. Each player observes his own
characteristics, but not the other players'. Thus, he has imperfect information
about nature's choice of their characteristics. (One can endow nature
with an objective function in order for it to become a player. One way of
doing so is to assume that nature is indifferent between all its moves. To
recover the equilibria of the original game (i.e., for given initial probability
distributions), one takes the projection of the equilibrium correspondence
for these probability distributions.)

The notion of "type": The "characteristic" or "type" of a player embodies
everything which is relevant to this player's decision making. This
includes the description of his objective function (fundamentals), his beliefs
about the other players' objective functions (beliefs about fundamentals),
his beliefs about what the other players believe his objective function
is (beliefs about beliefs about fundamentals), etc. As this is a bit
abstract, it is helpful to begin with Harsanyi's simple representation (this
representation is used in virtually all applications). Suppose that in an
oligopoly context, each firm's marginal cost $c_i$ is drawn from an "objective"
distribution $p_i(c_i)$. (N.b.: we will write probability distributions as if
the number of potential types were finite. Continuous type spaces are also
allowed; summation signs should then be replaced by integral signs.) $c_i$
is observed by firm i, but not by the other firms; $p_i$ is common knowledge:
everybody knows that $c_i$ is drawn from this distribution, everybody knows
that everybody knows that $c_i$ is drawn from this distribution, etc.^ In this
case firm i's type is fully summarized by $c_i$: because the probability
distributions are common knowledge, knowing $c_i$ amounts to knowing everything
known by firm i. By abuse of terminology, one can identify firm i's type with
the realization of $c_i$.

More generally, Harsanyi assumed that the players' types $\{t_i\}_{i=1}^n$ are
drawn from some objective distribution $p(t_1,\ldots,t_n)$, where $t_i$ belongs to some
space $T_i$. For simplicity, let us assume that $T_i$ has a finite number $|T_i|$ of
elements. $t_i$ is observed by player i only. $p_i(t_{-i}|t_i)$ denotes player i's
conditional probability about his opponents' types
$t_{-i} = (t_1,\ldots,t_{i-1},t_{i+1},\ldots,t_n)$ given his type $t_i$.

To complete the description of a Bayesian game, we must specify an action
set $A_i$ (with elements $a_i$) and an objective function
$\Pi_i(a_1,\ldots,a_n,t_1,\ldots,t_n)$ for each player i. The action spaces $A_i$, the
objective functions $\Pi_i$ and the probability distribution p are common knowledge
(every player knows them, knows that everybody knows them, ...). In other words,
everything which is not commonly known is subsumed in the type.

^Aumann [1976] formalizes common knowledge in the following way. Let
$(\Omega,p)$ be a finite probability space and let P and Q denote two partitions
of $\Omega$ representing the information of two players. Let R denote
the meet of P and Q (i.e., the finest common coarsening of P and Q). An
event E is common knowledge between the two players at $\omega \in \Omega$ if the
event in R that includes $\omega$ is itself included in E.
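
Aumann's partition-based definition of common knowledge, quoted in the footnote above, lends itself to a small computational illustration. The Python sketch below is our own construction (the function names and the five-state example are assumptions made for illustration): it computes the meet of two partitions by merging overlapping cells, and then tests whether an event is common knowledge at a given state.

```python
def meet(partition_p, partition_q):
    """Finest common coarsening of two partitions of the same finite set."""
    parent = {}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for block in list(partition_p) + list(partition_q):
        block = list(block)
        for state in block:
            parent.setdefault(state, state)
        for state in block[1:]:
            # States in the same cell of either partition end up in one cell of the meet.
            parent[find(block[0])] = find(state)

    cells = {}
    for state in parent:
        cells.setdefault(find(state), set()).add(state)
    return [frozenset(cell) for cell in cells.values()]

def is_common_knowledge(event, state, partition_p, partition_q):
    """True if the cell of the meet containing `state` is included in `event`."""
    cell = next(c for c in meet(partition_p, partition_q) if state in c)
    return cell <= set(event)

# Hypothetical example with five states of the world.
P = [{1, 2}, {3, 4}, {5}]
Q = [{1}, {2, 3}, {4}, {5}]
print(sorted(sorted(c) for c in meet(P, Q)))
print(is_common_knowledge({1, 2, 3}, 1, P, Q))
print(is_common_knowledge({1, 2, 3, 4}, 1, P, Q))
```

In this example both players know the event {1, 2, 3} at state 1, yet it is not common knowledge there, because the cell of the meet containing state 1 also includes state 4.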

The Harsanyi formulation looks a priori restrictive because it presumes
a large common knowledge base. However, as Mertens and Zamir [1983] have
shown (see also Brandenburger-Dekel [1985]), one can always define type
spaces that are large enough to describe every element of player i's private
information. Coming back to our original idea, player i's type then includes
his beliefs about payoff-relevant information, his beliefs about the other
players' beliefs about payoff-relevant information, etc. Mertens and Zamir
essentially show that this infinite regression is well defined (under some
weak assumptions, the space of types is compact for the product topology).

In this section we consider only one-shot, simultaneous-move games of
incomplete information. The n players first learn their types and then
simultaneously choose their actions (note that the game is also a game of
imperfect information). The game is static in that the players are unable to
react to their opponents' actions. The inference process as to the other
players' types is irrelevant because the game is over at the time each player
learns some signal related to his opponents' moves. Section 4 considers
dynamic games and the associated updating process.

Each player's decision naturally depends on his information, i.e., his
type. For instance, a high cost firm chooses a high price. Let $a_i(t_i)$ denote
the action chosen by player i when his type is $t_i$ (this could also denote
a mixed strategy, i.e., a randomization over actions for a given type).
If he knew the strategies adopted by the other players $\{a_j(t_j)\}_{j \neq i}$ as a
function of their types, player i would be facing a simple decision problem;
given his type $t_i$, he ought to maximize:

$$\sum_{t_{-i}} p_i(t_{-i}|t_i)\,\Pi_i(a_1(t_1),\ldots,a_i,\ldots,a_n(t_n),t_1,\ldots,t_i,\ldots,t_n).$$

Harsanyi extended the idea of a Nash equilibrium by assuming that each player
correctly anticipates how each of his opponents behaves as a function of his
type:

Definition: A Bayesian equilibrium is a set of (type-contingent) strategies
$\{a_i^*(t_i)\}_{i=1}^n$ such that $a_i^*(t_i)$ is player i's best response to the other
strategies when his type is $t_i$:

$$a_i^*(t_i) \in \arg\max_{a_i} \sum_{t_{-i}} p_i(t_{-i}|t_i)\,\Pi_i(a_1^*(t_1),\ldots,a_i,\ldots,a_n^*(t_n),t_1,\ldots,t_i,\ldots,t_n).$$

Thus, the Bayesian equilibrium concept is a straightforward extension of
the Nash equilibrium concept, in which each player recognizes that the other
players' strategies depend on their types.

Proving existence of a Bayesian equilibrium turns out to involve a simple
extension of the proof of existence of a Nash equilibrium. The trick is the
following: since player i's optimal action depends on type $t_i$, everything is as if
player i's opponents were playing against $|T_i|$ different players, each of
these players being drawn and affecting his opponents' payoffs with some
probability. Thus, considering different types of the same player as different
players transforms the original game into a game with $\sum_{i=1}^n |T_i|$
players. Each "player" is then defined by a name and a type. He does not
care (directly) about the action of a player with the same name and a different
type (another incarnation of himself), but he does care about the other
players' actions. If $a_{i,t_i}$ denotes the action chosen by player $\{i,t_i\}$,
player $\{i,t_i\}$'s objective is to maximize over a:

$$\sum_{t_{-i}} p_i(t_{-i}|t_i)\,\Pi_i(a_{1,t_1},\ldots,a,\ldots,a_{n,t_n},t_1,\ldots,t_i,\ldots,t_n).$$

Thus, existence of a Bayesian equilibrium of a game with n players stems
directly from the existence of a Nash equilibrium for a game with $\sum_{i=1}^n |T_i|$
players, as long as the numbers of players and types are finite.

With a continuum of types, some technicalities appear about whether
there exists a measurable structure over the set of random variables (Aumann
[1964]). One is then led to define a mixed strategy as a measurable function
from $[0,1] \times T_i$ into $A_i$. Or, equivalently, one can define it, as Milgrom and
Weber [1985] do, as a measure on the subsets of $T_i \times A_i$ for which the marginal
distribution on $T_i$ is $p_i$. Milgrom and Weber give sufficient conditions for
the existence of an equilibrium in such settings.

Example 1: Consider a duopoly playing Cournot (quantity) competition. Let
firm i's profit be quadratic: $\Pi_i = q_i(t_i - q_i - q_j)$, where $t_i$ is the difference
between the intercept of the linear demand curve and firm i's constant unit
cost (i = 1,2) and $q_i$ is the quantity chosen by firm i ($a_i = q_i$). It is
common knowledge that, for firm 1, $t_1 = 1$ ("firm 2 has complete information
about firm 1", or "firm 1 has only one potential type"). Firm 2, however,
has private information about its unit cost. Firm 1 only knows that $t_2 = 3/4$
or $t_2 = 5/4$, each with probability 1/2. Thus, firm 2 has two potential types,
which we will call the "low cost type" ($t_2 = 5/4$) and the "high cost type"
($t_2 = 3/4$). The two firms choose their outputs simultaneously. Let us look for
a pure strategy equilibrium. Firm 1 plays $q_1$, firm 2 plays $q_2^L$ (if $t_2 = 5/4$)
or $q_2^H$ (if $t_2 = 3/4$). Let us start with firm 2:

$$q_2(t_2) \in \arg\max_{q_2} \{q_2(t_2 - q_2 - q_1)\} \;\Rightarrow\; q_2(t_2) = (t_2 - q_1)/2.$$

Let us now consider firm 1, which does not know which type it faces:

$$q_1 \in \arg\max_{q_1} \left\{\tfrac{1}{2} q_1(1 - q_1 - q_2^H) + \tfrac{1}{2} q_1(1 - q_1 - q_2^L)\right\} \;\Rightarrow\; q_1 = (1 - Eq_2)/2,$$

where $E(\cdot)$ denotes an expectation over firm 2's types. But
$Eq_2 = \tfrac{1}{2} q_2^H + \tfrac{1}{2} q_2^L = (Et_2 - q_1)/2 = (1 - q_1)/2$. One thus obtains
$\{q_1 = 1/3,\ q_2^L = 11/24,\ q_2^H = 5/24\}$
as a Bayesian equilibrium (one can prove this equilibrium is unique). This
simple example illustrates how one can compute the Bayesian equilibrium as a
Nash equilibrium of a 3-player game ($|T_1| = 1$, $|T_2| = 2$).
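
The computation in Example 1 can be replayed with exact rational arithmetic. The Python sketch below is an illustration only, and the function name and argument layout are our assumptions: it combines firm 1's best response with the type-by-type best responses of firm 2, solves the resulting linear equation for firm 1's output, and then recovers the output of each type of firm 2.

```python
from fractions import Fraction

def bayesian_cournot(t1, types2):
    """Pure-strategy Bayesian equilibrium of the duopoly with profit q_i * (t_i - q_i - q_j).

    t1     : firm 1's commonly known type.
    types2 : dict mapping each possible type t2 of firm 2 to its probability.
    """
    expected_t2 = sum(t2 * prob for t2, prob in types2.items())
    # Combine q1 = (t1 - E[q2]) / 2 with E[q2] = (E[t2] - q1) / 2 and solve for q1.
    q1 = (2 * t1 - expected_t2) / 3
    # Each type of firm 2 best-responds to q1.
    q2 = {t2: (t2 - q1) / 2 for t2 in types2}
    return q1, q2

types2 = {Fraction(5, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
q1, q2 = bayesian_cournot(Fraction(1), types2)
expected_q2 = sum(q2[t2] * prob for t2, prob in types2.items())
print(q1, q2[Fraction(5, 4)], q2[Fraction(3, 4)])
```

With the types of the example, the sketch returns the outputs 1/3, 11/24 and 5/24 reported in the text, and firm 1's output satisfies its own best-response condition against the expected output of firm 2.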

Example 2: Consider an incomplete-information version of the war of attrition
discussed in Section 2D. Firm i chooses a number $a_i$ in $[0,+\infty)$. Both
firms choose simultaneously. The payoffs are:

$$\Pi_i = \begin{cases} -a_i, & \text{if } a_j \geq a_i \\ t_i - a_j, & \text{if } a_j < a_i. \end{cases}$$

$t_i$, firm i's type, is private information and takes values in $[0,+\infty)$ with
cumulative distribution function $P_i(t_i)$ and density $p_i(t_i)$. Types are, as in
Example 1, independent between the players. $t_i$ is the prize to the winner,
i.e., the highest bidder. The game resembles a second-bid auction in that
the winner pays the second bid. However, it differs from the second-bid
auction in that the loser also pays the second bid.

Let us look for a Bayesian equilibrium of this game. Let $a_i(t_i)$ denote
firm i's strategy. Then, we require

$$a_i(t_i) \in \arg\max_{a_i} \Big\{ -a_i\,\mathrm{Prob}(a_j(t_j) \geq a_i) + \int_{\{t_j \mid a_j(t_j) < a_i\}} (t_i - a_j(t_j))\,p_j(t_j)\,dt_j \Big\}.$$


A few tricks make the problem easy to solve. First, one can write the
"self-selection constraints": by definition of equilibrium, type $t_i$
prefers $a_i(t_i)$ to $a_i(t_i')$, and type $t_i'$ prefers $a_i(t_i')$ to
$a_i(t_i)$. Writing the two corresponding inequalities and adding them up
shows that $a_i$ must be a non-decreasing function of $t_i$. Second, it is
easy to show that there cannot be an atom at $a_i > 0$, i.e.,
$\mathrm{Prob}(a_j(t_j) = a_i) = 0$ for $a_i > 0$. To prove this, notice that
if there were an atom of types of firm j playing $a_i$, firm i would never
play in $[a_i - \varepsilon, a_i)$ for $\varepsilon$ small: it would be
better off bidding just above $a_i$ (the proof is a bit loose here, but can
be made rigorous). Thus, the types of firm j that play $a_i$ would be better
off playing $(a_i - \varepsilon)$, because this would not reduce the
probability of winning and would lead to reduced payments.
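The monotonicity step can be spelled out as follows. The symbols $W_i(a)$ (firm i's probability of winning with bid $a$) and $C_i(a)$ (its expected payment with bid $a$) are shorthand introduced here for illustration, not notation from the text. The point is that neither depends on firm i's own type, so the expected payoff is linear in $t_i$.

```latex
\begin{align*}
U_i(a, t_i) &= t_i\, W_i(a) - C_i(a), \\
t_i\, W_i\bigl(a_i(t_i)\bigr) - C_i\bigl(a_i(t_i)\bigr)
  &\ge t_i\, W_i\bigl(a_i(t_i')\bigr) - C_i\bigl(a_i(t_i')\bigr), \\
t_i'\, W_i\bigl(a_i(t_i')\bigr) - C_i\bigl(a_i(t_i')\bigr)
  &\ge t_i'\, W_i\bigl(a_i(t_i)\bigr) - C_i\bigl(a_i(t_i)\bigr), \\
\text{adding:}\qquad
(t_i - t_i')\,\Bigl[\,W_i\bigl(a_i(t_i)\bigr) - W_i\bigl(a_i(t_i')\bigr)\Bigr] &\ge 0 .
\end{align*}
```

A higher type therefore wins with weakly higher probability, and since $W_i$ is non-decreasing in the bid, this is consistent with a non-decreasing bid function.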

Let us look for a strictly monotonic, continuous function $a_i(t_i)$ with
inverse $t_i = \Phi_i(a_i)$. Thus, $\Phi_i(a_i)$ is the type that bids
$a_i$. We then obtain:

$$a_i(t_i) \in \arg\max_{a_i} \left\{ -a_i \bigl[ 1 - P_j(\Phi_j(a_i)) \bigr] + \int_0^{a_i} (t_i - a_j)\, p_j(\Phi_j(a_j))\, \Phi_j'(a_j)\, da_j \right\}.$$

By differentiating, one obtains a system of two differential equations in
$\Phi_1(\cdot)$ and $\Phi_2(\cdot)$ (or, equivalently, in $a_1(\cdot)$ and
$a_2(\cdot)$). Rather than doing so, let us take the following intuitive
approach: If firm i, with type $t_i$, bids $(a_i + da_i)$ instead of $a_i$,
it loses $da_i$ with probability 1 (since there is no atom), conditionally
on firm j bidding at least $a_i$ (otherwise this increase has no effect). It
gains $t_i$ ($= \Phi_i(a_i)$) with probability
$\{ p_j(\Phi_j(a_i))\, \Phi_j'(a_i) / (1 - P_j(\Phi_j(a_i))) \}\, da_i$.
Thus, in order for firm i to be indifferent:

$$\Phi_i(a_i)\, p_j(\Phi_j(a_i))\, \Phi_j'(a_i) = 1 - P_j(\Phi_j(a_i)).$$

We leave it to the reader to check that, for a symmetric exponential
distribution $P_i(t_i) = 1 - e^{-t_i}$, there exists a symmetric
equilibrium: $\Phi_i(a_i) = \sqrt{2 a_i}$, which corresponds to
$a_i(t_i) = t_i^2/2$ (as Riley [1980] has shown, there also exists a
continuum of asymmetric equilibria:
$\Phi_1 = K \sqrt{a_1}$ and $\Phi_2 = (2/K) \sqrt{a_2}$ for $K > 0$).
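The check left to the reader can be done numerically. The sketch below assumes the exponential distribution $P(t) = 1 - e^{-t}$ for both firms and evaluates the two sides of the indifference condition for the family $\Phi_1 = K\sqrt{a}$, $\Phi_2 = (2/K)\sqrt{a}$; the symmetric equilibrium is the member with $K = \sqrt{2}$. The helper names `family` and `residual` are hypothetical.

```python
import math

def family(K):
    """Inverse bid functions Phi_1(a) = K*sqrt(a), Phi_2(a) = (2/K)*sqrt(a)."""
    return (lambda a: K * math.sqrt(a)), (lambda a: (2.0 / K) * math.sqrt(a))

def residual(phi_i, phi_j, a, h=1e-6):
    """Left side minus right side of the indifference condition at bid a.

    Assumes exponential types, so p_j(t) = exp(-t) and 1 - P_j(t) = exp(-t).
    """
    dphi_j = (phi_j(a + h) - phi_j(a - h)) / (2 * h)  # central difference
    density = math.exp(-phi_j(a))    # p_j(Phi_j(a))
    survival = math.exp(-phi_j(a))   # 1 - P_j(Phi_j(a))
    return phi_i(a) * density * dphi_j - survival

worst = 0.0
for K in (math.sqrt(2.0), 0.5, 3.0):
    phi1, phi2 = family(K)
    for a in (0.1, 1.0, 4.0):
        worst = max(worst, abs(residual(phi1, phi2, a)),
                    abs(residual(phi2, phi1, a)))
```

With exponential types the hazard rate is constant, so the condition reduces to $\Phi_i(a)\,\Phi_j'(a) = 1$ for both firms, which is why a whole one-parameter family satisfies it.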

Let us now give an industrial organization interpretation of the game.
Suppose that there are two firms in the market; they both lose 1 per unit of
time when they compete; each makes a monopoly profit once its opponent has
left the market, the present discounted value of which is $t_i$ (it would
make sense to assume that the duopoly and monopoly profits are correlated,
but such a modification would hardly change the results). The firms play a
war of attrition. $a_i$ is the time firm i intends to stay in the market, if
firm j has not exited before. At this stage, the reader may wonder about our
dynamic interpretation: if firms are free to leave when they want and are
not committed to abide by their date 0 choice of $a_i$, is the Bayesian
equilibrium "perfect"? It turns out that the answer is "yes"; the dynamic
game is essentially a static game (which is the reason why we chose to
present it in this section). At any time $a_i$, either firm j has dropped
out (bid less than $a_i$) and the game is over, or firm j is still in the
market and the conditional probability of exit is the one computed earlier.
Thus the equilibrium is perfect as well.^

^The war of attrition was introduced in the theoretical biology literature
(e.g., Maynard Smith [1974], Riley [1980]) and has known many applications
since. It was introduced in industrial organization by Kreps-Wilson
[1982a]. (See also Nalebuff [1982] and Ghemawat-Nalebuff [1985].) For a
characterization of the set of equilibria and a uniqueness result with
changing duopoly payoffs and/or large uncertainty over types, see
Fudenberg-Tirole [1986]. See also Hendricks and Wilson [1985a, 1985b].


3B.   Using Bayesian Equilibria to Justify Mixed Equilibria

In Section 1, we saw that simultaneous move games of complete information
often admit mixed strategy equilibria. Some researchers are unhappy with
this notion because, they argue, "real world decision makers do not flip a
coin." However, as Harsanyi [1973] has shown, mixed strategy equilibria of
complete information games can often be vindicated as the limits of pure
strategy equilibria of slightly perturbed games of incomplete information.
Indeed, we have already noticed that in a Bayesian game, once the players'
type-contingent strategies have been computed, each player behaves as if he
were facing mixed strategies by his opponents (nature creates uncertainty
through its choice of types rather than the choice of the side of the coin).

To illustrate the mechanics of this construction, let us consider the
one-period version of the "grab-the-dollar" game introduced in Section 2.
Each player has two possible actions: investment, no investment. In the
complete information version of the game, a firm gains 1 if it is the only
one to make the investment (wins), loses 1 if both invest, and breaks even
if it does not invest. (We can view this game as an extremely crude
representation of a natural monopoly market.) The only symmetric equilibrium
involves mixed strategies: each firm invests with probability 1/2. This
clearly is an equilibrium: each firm makes 0 if it does not invest, and
$\tfrac{1}{2}(1) + \tfrac{1}{2}(-1) = 0$ if it does. Now consider the same
game with the following type of incomplete information: Each firm has the
same payoff structure except that, when it wins, it gets $(1+t)$ where $t$
is uniformly distributed on $[-\varepsilon, +\varepsilon]$. Each firm knows
its type $t$, but not that of the other firm. Now, it is easily seen that
the symmetric pure strategies "$a(t < 0)$ = do not invest, $a(t > 0)$ =
invest" form a Bayesian equilibrium. From the point of view of each firm,
the other firm invests with probability 1/2. Thus, the firm should invest if
and only if $\tfrac{1}{2}(1+t) + \tfrac{1}{2}(-1) > 0$, i.e., $t > 0$. Last,
note that, when $\varepsilon$ converges to zero, the pure strategy Bayesian
equilibrium converges to the mixed strategy Nash equilibrium of the complete
information game.
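The purification argument for the grab-the-dollar game can be illustrated with exact arithmetic. This is a minimal sketch assuming cutoff strategies (invest if and only if the type exceeds a cutoff); the function names are hypothetical.

```python
from fractions import Fraction as F

def invest_probability(cutoff, eps):
    """Probability that a firm with t uniform on [-eps, eps] invests (t > cutoff)."""
    return (eps - cutoff) / (2 * eps)

def best_response_cutoff(opp_cutoff, eps):
    """Type at which investing just breaks even against the rival's cutoff rule.

    Investing yields (1 - p) * (1 + t) - p, where p is the rival's investment
    probability; this is >= 0 iff t >= (2p - 1) / (1 - p). Requires p < 1.
    """
    p = invest_probability(opp_cutoff, eps)
    return (2 * p - 1) / (1 - p)

# The cutoff 0 is a fixed point for every eps, and each firm then looks to its
# rival as if it were mixing 1/2 : 1/2, however small the perturbation.
results = [(best_response_cutoff(F(0), eps), invest_probability(F(0), eps))
           for eps in (F(1, 2), F(1, 10), F(1, 1000))]
```

The investment probability perceived by the rival does not depend on $\varepsilon$, which is why the limit as $\varepsilon$ goes to zero reproduces the mixed strategy equilibrium.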

As another example, the reader may want to study the symmetric war of
attrition. Under complete information and symmetric payoffs, it is easily
shown that in a symmetric equilibrium, each player's strategy is a mixed
strategy with exponential distribution over possible times.' The symmetric
incomplete information equilibrium (computed in Section 3A) converges to
this mixed strategy equilibrium when the uncertainty converges to zero (see
Milgrom-Weber for the case of a uniform distribution).

'Letting $t$ denote the common payoff to winning, waiting $da$ more yields
$\{x(a)\,t\}\,da$ where $x(a)\,da$ is the probability that the opponent
drops between $a$ and $(a+da)$. This must equal the cost of waiting: $da$.
Thus, $x(a) = 1/t$ is independent of time $a$.

Milgrom-Weber [1985] offers sufficient (continuity) conditions on the
objective functions and information structure so that the limit of Bayesian
equilibrium strategies, when the uncertainty becomes "negligible," forms a
Nash equilibrium of the limit complete information game. (Note: the war of
attrition does not satisfy their continuity conditions; but as Milgrom and
Weber show, the result holds anyway.) They also identify a class of
(atomless) games for which there exists a pure strategy equilibrium.

We  must  realize  that  games  of  complete  information  are  an  idealization. 
In  practice,  everyone  has  at  least  a  slight  amount  of  incomplete  information 
about  the  others'  objectives;  Harsanyi's  argument  shows  that  it  is  hard  to 
make  a  strong  case  against  mixed  strategy  equilibria  on  the  grounds  that  they 
require  a  randomizing  device. 

4.   Dynamic  Games  of  Incomplete  Information 

We now study games in which, at some point of time, a player bases his
decision on a signal that conveys information about another player. This
type of game is dynamic in that a player reacts to another player's move.
The tricky aspect of it is that, under incomplete information, the former
must apply Bayes' rule to update his beliefs about the latter's type. To do
so, he uses the latter's choice of action (or a signal of it) and
equilibrium strategy, as we shall see shortly. The equilibrium notion for
dynamic games of incomplete information is naturally a combination of the
subgame perfect equilibrium concept that we discussed earlier and Harsanyi
[1967]'s concept of Bayesian equilibrium for games of incomplete
information. In this section we consider the simplest such notion, that of
perfect Bayesian equilibrium, as well as some easy-to-apply (and sometimes
informal) refinements. In the next section, we will discuss more formal
refinements of the perfect Bayesian equilibrium concept for finite games.

The notion of a perfect Bayesian equilibrium was developed under various
names and in various contexts in the late 'sixties and the 'seventies. In
economics, Akerlof [1970] and Spence [1974]'s market games and
Ortega-Reichert [1967]'s analysis of repeated first-bid auctions make
implicit use of the concept. In industrial organization the first and
crucial application is Milgrom-Roberts [1982a]'s limit pricing paper,
followed by the work of Kreps-Wilson [1982a] and Milgrom-Roberts [1982b] on
reputation. In game theory, Selten [1975] introduced the idea of trembles to
refine the concept of subgame perfect equilibria in games without (many)
proper subgames. (If each player's type is private information, the only
proper subgame is the whole game, so subgame perfection has no force.)
Kreps-Wilson [1982b]'s sequential equilibrium is similar, but, in the
tradition of the economics literature, it emphasizes the formation of
beliefs, which makes the introduction of refinements easier to motivate. We
should also mention the work of Aumann-Maschler [1967] on repeated games of
incomplete information.

We  start  this  section  with  the  simplest  example  of  a  dynamic  game  of 
incomplete  information,  the  signaling  game.   This  is  a  two-period  leader- 
follower  game  in  which  the  leader  is  endowed  with  private  information  that 
affects  the  follower.  We  give  some  examples  of  such  games  and  introduce  some 
refinements  of  the  equilibrium  concept.  As  the  principles  enunciated  here 
for  signaling  games  generally  carry  over  to  general  games,  we  do  not  treat 
the  latter  in  order  to  save  on  notation  and  space. 

4A.   The  Basic  Signaling  Game 

As mentioned earlier, the simplest game in which the issues of updating
and perfection arise simultaneously has the following structure: There are
two players; player 1 is the leader (also called "sender", because he sends
a signal) and player 2 the follower (or "receiver"). Player 1 has private
information about his type $t_1$ in $T_1$, and chooses action $a_1$ in
$A_1$. Player 2, whose type is common knowledge for simplicity, observes
$a_1$ and chooses $a_2$ in $A_2$. Payoffs are equal to
$\Pi_i(a_1, a_2, t_1)$ ($i = 1,2$). Before the game begins, player 2 has
prior beliefs $p_1(t_1)$ about player 1's type.

Player 2, who observes player 1's move before choosing his own action,
should update his beliefs about $t_1$ and base his choice of $a_2$ on the
posterior distribution $\tilde{p}_1(t_1 | a_1)$. How is this posterior
formed? As in a Bayesian equilibrium, player 1's action ought to depend on
his type; let $a_1^*(t_1)$ denote this strategy (as before, this notation
allows a mixed strategy). Thus, figuring out $a_1^*(\cdot)$ and observing
$a_1$, player 2 can use Bayes' rule to update $p_1(\cdot)$ into
$\tilde{p}_1(\cdot | a_1)$. And, in a rational expectations world, player 1
should anticipate that his action will affect player 2's action also through
the posterior beliefs. Thus, the natural extension of the Nash equilibrium
concept to the signaling game is:

Definition: A perfect Bayesian equilibrium (PBE) of the signaling game is a
set of strategies $a_1^*(t_1)$ and $a_2^*(a_1)$ and posterior beliefs
$\tilde{p}_1(t_1 | a_1)$ such that:

($P_1$)  $a_2^*(a_1) \in \arg\max_{a_2} \sum_{t_1} \tilde{p}_1(t_1 | a_1)\, \Pi_2(a_1, a_2, t_1)$

($P_2$)  $a_1^*(t_1) \in \arg\max_{a_1} \Pi_1(a_1, a_2^*(a_1), t_1)$

(B)  $\tilde{p}_1(t_1 | a_1)$ is derived from the prior $p_1(\cdot)$, $a_1$
and $a_1^*(\cdot)$ using Bayes' rule (when applicable).

($P_1$) and ($P_2$) are the perfectness conditions. ($P_1$) states that
player 2 reacts optimally to player 1's action given his posterior beliefs
about $t_1$. ($P_2$) demonstrates the optimal Stackelberg behavior by player
1; note that he takes into account the effect of $a_1$ on player 2's action.
(B) corresponds to the application of Bayes' rule. The qualifier "when
applicable" stems from the fact that, if $a_1$ is not part of player 1's
optimal strategy for any type, observing $a_1$ is a zero-probability event
and Bayes' rule does not pin down posterior beliefs. Any posterior beliefs
$\tilde{p}_1(\cdot | a_1)$ are then admissible. Indeed, the purpose of the
refinements of the perfect Bayesian equilibrium concept is to put some
restrictions on these posterior beliefs.
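Condition (B) can be made concrete with a small sketch. The type and action labels below (`"t_a"`, `"t_b"`, `"x"`, `"y"`, `"z"`) and the function name `posterior` are purely illustrative; the sketch assumes finitely many types and a possibly mixed strategy for player 1.

```python
from fractions import Fraction as F

def posterior(prior, strategy, action):
    """Bayes' rule for the receiver in a finite signaling game.

    prior:    {type: probability}
    strategy: {type: {action: probability}}  (player 1's mixed strategy)
    Returns {type: posterior probability}, or None when the action has zero
    probability under the strategy, in which case Bayes' rule is silent and
    any beliefs are admissible in a PBE.
    """
    joint = {t: prior[t] * strategy[t].get(action, 0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        return None
    return {t: w / total for t, w in joint.items()}

prior = {"t_a": F(3, 4), "t_b": F(1, 4)}
strategy = {"t_a": {"x": F(1, 3), "y": F(2, 3)}, "t_b": {"x": F(1)}}
on_x = posterior(prior, strategy, "x")   # both types play x with positive probability
on_y = posterior(prior, strategy, "y")   # only t_a plays y
on_z = posterior(prior, strategy, "z")   # off the equilibrium path
```

The `None` branch is exactly where refinements of the PBE concept do their work.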

Thus, a PBE is simply a set of strategies and beliefs such that, at any
stage of the game, strategies are optimal given beliefs and beliefs are
obtained from equilibrium strategies and observed actions using Bayes' rule.
Two features of the concept developed thus far should be emphasized:

First, a PBE has a strong fixed point flavor. Beliefs are derived from
strategies, which are optimal given beliefs. For this reason, there exists
no handy algorithm to help us construct equilibria. Remember that for games
of complete information, Kuhn's algorithm of backward induction gave us the
set of perfect equilibria. Here we must also operate the Bayesian updating
in a forward manner. This makes the search for equilibria rely on a few
tricks (to be discussed later) rather than on a general method.

Second, too little structure has been imposed on the type and action
spaces and on the objective functions to prove existence of a PBE. Actually,
existence theorems are available only for games with a finite number of
types and actions (see subsection 4E). Most applications, however, involve
a continuum of types and/or a continuum of actions. Existence is then
obtained by construction, on a case by case basis.

For more general games than the signaling game, the definition of a PBE
is the same: At each information set, posterior beliefs are computed using
optimal strategies and the information at the information set. And
strategies are optimal given beliefs. We will not give the formal definition
of this because it involves nothing more than a (very heavy) extension of
the notation.

Let us now give simple examples of PBE in signaling games. From now on,
we delete the subscript on player 1's type, as there is no possible
confusion.

4B.   Examples 

Example 1: A two-period reputation game. The following is a much simplified
version of the Kreps-Wilson-Milgrom-Roberts reputation story. There are two
firms ($i = 1,2$). In period 1, they are both in the market. Only firm 1
(the "incumbent") takes an action $a_1$. The action space has two elements:
"prey" and "accommodate." The profit of firm 2 (the "entrant") is $D_2$ if
firm 1 accommodates and $P_2$ if firm 1 preys, such that $D_2 > 0 > P_2$.
Firm 1 has one of two potential types $t_1$: "sane" and "crazy." When sane,
firm 1 makes $D_1$ when it accommodates and $P_1$ when it preys, where
$D_1 > P_1$. Thus, a sane firm prefers to accommodate rather than to prey.
However, it would prefer to be a monopoly, in which case it would make $M_1$
per period. When crazy, firm 1 enjoys predation and thus preys (its utility
function is such that it is always worth preying). Let $p_1$ (respectively,
$(1-p_1)$) denote the prior probability that firm 1 is sane (respectively,
crazy).

In period 2, only firm 2 chooses an action $a_2$. This action can take two
values: "stay" and "exit." If it stays, it obtains a payoff $D_2$ if firm 1
is actually sane, and $P_2$ if it is crazy (the idea is that unless it is
crazy, firm 1 will not pursue any predatory strategy in the second period
because there is no point building or keeping a reputation at the end; this
assumption can be derived more formally from the description of the
second-period competition). The sane firm gets $D_1$ if firm 2 stays and
$M_1 > D_1$ if firm 2 exits. We let $\delta$ denote the discount factor
between the two periods.

We presumed that the crazy type always preys. The interesting thing to
study is thus the sane type's behavior. From a static point of view, it
would want to accommodate in the first period; however, by preying it might
convince firm 2 that it is of the crazy type, and thus induce exit (as
$P_2 < 0$) and increase its second-period profit.

Let us first start with a taxonomy of potential perfect Bayesian
equilibria. A separating equilibrium is an equilibrium in which firm 1's two
types choose two different actions in the first period. Here, this means
that the sane type chooses to accommodate. Note that in a separating
equilibrium, firm 2 has complete information in the second period:

$\tilde{p}_1(t = \text{sane} \mid a_1 = \text{accommodate}) = 1$ and $\tilde{p}_1(t = \text{crazy} \mid a_1 = \text{prey}) = 1$.

A pooling equilibrium is an equilibrium in which firm 1's two types choose
the same action in the first period. Here, this means that the sane type
preys. In a pooling equilibrium, firm 2 does not update its beliefs when
observing the equilibrium action:
$\tilde{p}_1(t = \text{sane} \mid a_1 = \text{prey}) = p_1$. Last, there can
also exist hybrid or semi-separating equilibria. For instance, in the
reputation game, the sane type may randomize between preying and
accommodating, i.e., between pooling and separating. One then has

$\tilde{p}_1(t = \text{sane} \mid a_1 = \text{prey}) \in (0, p_1)$ and $\tilde{p}_1(t = \text{sane} \mid a_1 = \text{accommodate}) = 1$.

Let us first look for conditions of existence of a separating
equilibrium. In such an equilibrium, the sane type accommodates and thus
reveals its type and obtains $D_1(1+\delta)$ (firm 2 stays because it
expects $D_2 > 0$ in the second period). If it decided to prey, it would
convince firm 2 that it is crazy and would thus obtain $P_1 + \delta M_1$.
Thus, a necessary condition for the existence of a separating equilibrium
is:

(6)  $\delta (M_1 - D_1) \leq D_1 - P_1$.

Conversely, suppose that (6) is satisfied. Consider the following strategies
and beliefs: the sane incumbent accommodates, and the entrant (correctly)
anticipates that the incumbent is sane when observing accommodation; the
crazy incumbent preys and the entrant (correctly) anticipates that the
incumbent is crazy when observing predation. Clearly, these strategies and
beliefs form a separating PBE.

Let us now look at the possibility of a pooling equilibrium. Both types
prey; thus, as we saw, $\tilde{p}_1 = p_1$ when predation is observed. Now,
the sane type, who loses $(D_1 - P_1)$ in the first period, must induce
exit. Thus, it must be the case that:

(7)  $p_1 D_2 + (1 - p_1) P_2 \leq 0$.

Conversely, assume that (7) holds, and consider the following strategies and
beliefs: both types prey; the entrant has posterior beliefs
$\tilde{p}_1 = p_1$ when predation is observed and $\tilde{p}_1 = 1$ when
accommodation is observed. The sane type's equilibrium profit is
$P_1 + \delta M_1$ while it would become $D_1(1+\delta)$ under
accommodation. Thus, if (6) is violated, the proposed strategies and beliefs
form a pooling PBE (note that if (7) is satisfied with equality, there
exists not one, but a continuum of such equilibria). So the equilibrium is
that the entrant never enters and the incumbent never preys.

We leave it to the reader to check that if both (6) and (7) are violated,
the unique equilibrium is a hybrid PBE (with the entrant randomizing when
observing predation).

Remark: The (generic) uniqueness of the PBE in this model is due to the fact
that the "strong" type (the crazy incumbent) is assumed to prey no matter
what. Thus, predation is not a zero probability event and, furthermore,
accommodation is automatically interpreted as coming from the sane type if
it belongs to the equilibrium path. The next example illustrates a more
complex and a more common structure, for which refinements of the PBE are
required. Example 2, which, in many respects, can be regarded as a
generalization of Example 1, also involves several cases resembling those in
Example 1.
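The three cases of Example 1 can be sketched in a few lines. Conditions (6) and (7) are taken from the text, read as weak inequalities; the mixing probabilities in the hybrid case are one way of working the exercise left to the reader (they make the entrant indifferent after predation and the sane incumbent indifferent between its two actions), not formulas stated in the source. The function name and dictionary keys are hypothetical.

```python
from fractions import Fraction as F

def classify_reputation_pbe(D1, P1, M1, D2, P2, p1, delta):
    """Classify the PBE of the two-period reputation game (illustrative sketch)."""
    cond6 = delta * (M1 - D1) <= D1 - P1        # separating condition (6)
    cond7 = p1 * D2 + (1 - p1) * P2 <= 0        # pooling condition (7)
    if cond6:
        return {"type": "separating"}
    if cond7:
        return {"type": "pooling"}
    # Hybrid case: both (6) and (7) are violated.
    # Posterior on "sane" after predation that makes the entrant indifferent.
    post = -P2 / (D2 - P2)
    # Probability x that the sane type preys, backed out from Bayes' rule:
    # post = p1*x / (p1*x + 1 - p1).
    x = post * (1 - p1) / (p1 * (1 - post))
    # Exit probability y after predation that makes the sane type indifferent:
    # P1 + delta*(y*M1 + (1-y)*D1) = D1*(1 + delta).
    y = (D1 - P1) / (delta * (M1 - D1))
    return {"type": "hybrid", "sane_prey_prob": x, "exit_prob": y,
            "posterior_sane": post}

base = dict(D1=F(2), P1=F(0), M1=F(5), D2=F(1), P2=F(-1))
hybrid = classify_reputation_pbe(p1=F(3, 4), delta=F(1), **base)
pooling = classify_reputation_pbe(p1=F(1, 4), delta=F(1), **base)
separating = classify_reputation_pbe(p1=F(3, 4), delta=F(1, 2), **base)
```

In the hybrid case both probabilities lie strictly between 0 and 1 precisely because (6) and (7) fail.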

Example 2: The Limit-Pricing Game

As mentioned earlier, the paper which introduced signaling games into
the industrial organization field is Milgrom-Roberts [1982a]'s article on
limit pricing. Let us take the following simple version of their two-period
model. Firm 1, the incumbent, has monopoly power in the first period and
chooses a first-period quantity $a_1 = q_1$. Firm 2, the entrant, then
decides to enter or to stay out in the second period (thus, as in the
previous game, $a_2 = 0$ or 1, or $a_2 \in [0,1]$ if we allow mixed
strategies). If it enters, there is duopolistic competition in period two.
Otherwise, firm 1 remains a monopoly.

Firm 1 can have one of two potential types: its constant unit production
cost is "high" (H) with probability $p_1$ and "low" (L) with probability
$(1-p_1)$. We will denote by $q_m^t$ the monopoly quantities for the two
types of incumbent ($t = H, L$). Naturally, $q_m^H < q_m^L$. We let
$M_1^t(q_1)$ denote the monopoly profit of type $t$ when producing $q_1$; in
particular, let $M_1^t = M_1^t(q_m^t)$ denote type $t$'s monopoly profit
when it maximizes its short-run profit. We assume that $M_1^t(q_1)$ is
strictly concave in $q_1$.

Firm 1 knows $t$ from the start; firm 2 does not. Let $D_2^t$ denote firm
2's duopoly profit when firm 1 has type $t$ (it possibly includes entry
costs). To make things interesting, let us assume that firm 2's entry
decision is influenced by its beliefs about firm 1's type:
$D_2^H > 0 > D_2^L$. The discount factor is $\delta$.

Let us look for separating equilibria. For this, we first obtain two
necessary conditions: that each type does not want to pick the other type's
equilibrium action ("incentive constraints"). We then complete the
description of equilibrium by choosing beliefs off the equilibrium path that
deter the two types from deviating from their equilibrium actions. Thus, our
necessary conditions are also sufficient, in the sense that the
corresponding quantities are equilibrium quantities. In a separating
equilibrium, the high-cost type's quantity induces entry. He thus plays
$q_m^H$ (if he did not, he could increase his first-period profit without
adverse effect on entry). Thus, he gets $M_1^H + \delta D_1^H$, where
$D_1^t$ is firm 1's duopoly profit when its type is $t$. Let $q_1$ denote
the output of the low-cost type. The high-cost type, by producing this
output, deters entry and obtains $M_1^H(q_1) + \delta M_1^H$. Thus, a
necessary condition for equilibrium is:

(8)  $M_1^H - M_1^H(q_1) \geq \delta (M_1^H - D_1^H)$.

The similar condition for the low-cost type is:

(9)  $M_1^L - M_1^L(q_1) \leq \delta (M_1^L - D_1^L)$.

To make things interesting, we will assume that there is no (separating)
equilibrium in which each type behaves as in a full information context;
i.e., the high-cost type would wish to pool by mimicking the low-cost type's
monopoly quantity:

(10)  $M_1^H - M_1^H(q_m^L) < \delta (M_1^H - D_1^H)$.

To characterize the set of $q_1$ satisfying (8) and (9), one must make more
specific assumptions on the demand and cost functions. We will not do it
here, and we refer to the literature for this. We just note that, under
reasonable conditions, (8) and (9) define a region
$[\underline{q}_1, \bar{q}_1]$, where $\underline{q}_1 > q_m^L$. Thus, to
separate, the low-cost type must produce sufficiently above its monopoly
quantity so as to make pooling very costly to the high-cost type. A crucial
assumption in the derivation of such an interval is the Spence-Mirrlees
(single-crossing) condition:


$$\frac{\partial}{\partial q_1} \bigl( M_1^L(q_1) - M_1^H(q_1) \bigr) > 0.$$


$\underline{q}_1$ is such that (8) is satisfied with equality; it is called
the "least-cost" separating quantity, because, of all potential separating
equilibria, the low-cost type would prefer the one at $\underline{q}_1$.
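To see what the region defined by (8) and (9) can look like, here is a minimal sketch under assumptions that are not in the text: linear inverse demand $P = a - q$, so that $M_1^t(q) = q(a - c^t - q)$ and $M_1^t - M_1^t(q) = (q - q_m^t)^2$, and duopoly profits $D_1^H$, $D_1^L$ supplied as illustrative numbers rather than derived from a duopoly model. The function names are hypothetical.

```python
import math

def monopoly(a, c):
    """Monopoly output and profit for inverse demand P = a - q and unit cost c."""
    q = (a - c) / 2
    return q, q * q

def separating_interval(a, c_high, c_low, d_high, d_low, delta):
    """Outputs above the monopoly quantities that satisfy (8) and (9).

    Only the high-output branch of (8) is considered; returns None if empty.
    """
    qmH, MH = monopoly(a, c_high)
    qmL, ML = monopoly(a, c_low)
    # (8) with equality: (q - qmH)^2 = delta * (MH - d_high).
    lower = qmH + math.sqrt(delta * (MH - d_high))
    # (9) with equality: (q - qmL)^2 = delta * (ML - d_low).
    upper = qmL + math.sqrt(delta * (ML - d_low))
    return (lower, upper) if lower <= upper else None

# Illustrative numbers: qmH = 3, qmL = 4, M^H = 9, M^L = 16.
interval = separating_interval(a=10, c_high=4, c_low=2,
                               d_high=5, d_low=9.75, delta=1)
```

With these numbers the least-cost separating quantity is 5, strictly above the low-cost monopoly quantity of 4, and (10) holds since $(4-3)^2 = 1 < 4$.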

Let us now show that these necessary conditions are also sufficient. Let
the high-cost type choose $q_m^H$ and the low-cost type choose $q_1$ in
$[\underline{q}_1, \bar{q}_1]$. When a quantity that differs from these two
quantities is observed, beliefs are arbitrary. The easiest way to obtain
equilibrium is to choose beliefs that induce entry; this way, the two types
will be little tempted to deviate from their presumed equilibrium
strategies. So let us specify that when the observed quantity does not
belong to $\{q_m^H, q_1\}$, $\tilde{p}_1 = 1$ (firm 2 believes firm 1 has
high cost); whether these beliefs, which are consistent with Bayes' rule,
are "reasonable" is discussed later on. Now, let us check that no type wants
to deviate. The high-cost type obtains its monopoly profit in the first
period and, thus, is not willing to deviate to another quantity that induces
entry. He does not deviate to $q_1$ either, from (8). And similarly for the
low-cost type. Thus, we have obtained a continuum of separating equilibria.

Note that this continuum of separating equilibria exists for any
$p_1 > 0$. By contrast, for $p_1 = 0$, the low-cost firm plays its monopoly
quantity $q_m^L$. We thus observe that a tiny change in the information
structure may make a huge difference. A very small probability that the firm
has high cost may force the low-cost firm to increase its production
discontinuously to signal its type. Games of incomplete information (which
include games of complete information!) are very sensitive to the
specification of the information structure, a topic we will come back to
later on.

Note also that Pareto dominance selects the least-cost separating
equilibrium among separating equilibria. [The entrant has the same utility
in all separating equilibria (the informative content is the same);
similarly, the high-cost type is indifferent. The low-cost type prefers
lower outputs.]

The existence of pooling equilibria hinges on whether the following
condition is satisfied:

(11)  $p_1 D_2^H + (1 - p_1) D_2^L < 0$.

Assume that condition (11) is violated (with a strict inequality; we will
not consider the equality case, for simplicity). Then, at the pooling
quantity, firm 2 makes a strictly positive profit if it enters (as
$\tilde{p}_1 = p_1$). This means that entry is not deterred, so that the two
types cannot do better than choosing their (static) monopoly outputs. As
these outputs differ, no pooling equilibrium can exist.

Assume, therefore, that (11) is satisfied so that a pooling quantity $q_1$
deters entry. A necessary condition for a quantity $q_1$ to be a pooling
equilibrium quantity is that neither type wants to play his static optimum.
If he were to do so, he would at worst induce entry. Therefore, $q_1$ must
satisfy (9) and the analogous condition for the high-cost type:

(12)  $M_1^H - M_1^H(q_1) \leq \delta (M_1^H - D_1^H)$.

Again, the set of outputs $q_1$ that satisfy both (9) and (12) depends on
the cost and demand functions. Let us simply notice that, from (10), there
exists an interval of outputs around $q_m^L$ that satisfy these two
inequalities.

Now it is easy to see that if $q_1$ satisfies (9) and (12), $q_1$ can be
made part of a pooling equilibrium. Suppose that whenever firm 1 plays an
output differing from $q_1$ (an off-the-equilibrium-path action), firm 2
believes firm 1 has a high cost. Firm 2 then enters, and firm 1 might as
well play its monopoly output. Thus, from (9) and (12), neither type would
want to deviate from $q_1$.
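The set of pooling quantities can be illustrated under the same kind of assumptions as one might use for the separating case: linear inverse demand $P = a - q$ (so that $M_1^t - M_1^t(q) = (q - q_m^t)^2$) and duopoly profits given as illustrative numbers. None of these functional forms or values come from the text, and the function names are hypothetical.

```python
import math

def pooling_interval(a, c_high, c_low, d_high, d_low, delta):
    """Outputs q_1 satisfying (9) and (12) under linear demand; None if empty."""
    qmH, qmL = (a - c_high) / 2, (a - c_low) / 2
    MH, ML = qmH ** 2, qmL ** 2
    rH = math.sqrt(delta * (MH - d_high))   # (12): |q - qmH| <= rH
    rL = math.sqrt(delta * (ML - d_low))    # (9):  |q - qmL| <= rL
    lo, hi = max(qmH - rH, qmL - rL), min(qmH + rH, qmL + rL)
    return (lo, hi) if lo <= hi else None

def entry_deterred(p1, d2_high, d2_low):
    """Condition (11): the entrant's expected duopoly profit under the prior is negative."""
    return p1 * d2_high + (1 - p1) * d2_low < 0

# Illustrative numbers: qmH = 3, qmL = 4; (10) holds since (4 - 3)^2 = 1 < 4.
interval = pooling_interval(a=10, c_high=4, c_low=2,
                            d_high=5, d_low=9.75, delta=1)
deterred = entry_deterred(p1=0.25, d2_high=1.0, d2_low=-1.0)
```

The interval contains the low-cost monopoly quantity 4, in line with the remark that (10) guarantees an interval of pooling outputs around $q_m^L$; pooling is only possible when `deterred` is true.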

We leave it to the reader to derive hybrid equilibria (the analysis is
very similar to the previous ones). We now investigate the issue of
refinements.

4C.  Some  Refinements 

Games of incomplete information in general have many PBE. The reason
why this is so is easy to grasp. Consider the basic signaling game and
suppose that one wants to rule out some action $a_1$ by player 1 as an
equilibrium action. If, indeed, $a_1$ is not played on the equilibrium path,
player 2's beliefs following $a_1$ are arbitrary. In most games there exists
some type $t$ such that if player 2 puts all the weight on $t$, he takes an
action that is detrimental for all types of player 1 (for instance, $t$ is
the high-cost type in the limit pricing game; it induces entry). As playing
$a_1$ produces a bad outcome for player 1, not playing $a_1$ on the
equilibrium path may be self-fulfilling. Some authors have noted that, while
non-credible actions were ruled out by the perfectness part of the PBE,
players could still "threaten" each other through beliefs. This subsection
and 4D discuss refinements that select subsets of PBEs.

Often, however, the very structure of the game tells us that some beliefs,
while allowable because off the equilibrium path, "do not make sense."
Over the years intuitive criteria for selection of beliefs have been
developed for each particular game. We mention here only a few of these
criteria. These criteria, which apply to all types of games (including games
with a continuum of types or actions), are sometimes informal in that they
have not been designed as part of a formal solution concept for which
existence has been proved. But most of them are, for finite games, satisfied
by the Kohlberg-Mertens [1986] concept of stable equilibria, which are known
to exist (see subsection 4E below). Last, we should warn the reader that the
presentation below resembles a list of cookbook recipes more than a unified
methodological approach.

i)  Elimination  of  Weakly  Dominated  Strategies 

In the tradition of Luce-Raiffa [1957], Farquharson [1969], Moulin
[1979], Bernheim [1984], and Pearce [1984], it seems natural to require that,
when an action is dominated for some type, but not for some other, the
posterior beliefs should not put any weight on the former type. This simple
restriction may already cut down the number of PBE considerably. Consider the
limit pricing game. Quantities above $\tilde q_1$ are dominated for the
high-cost type (if this type chooses $q_m^H$, its intertemporal profit is at
least $M_m^H + \delta D^H$; if it chooses $q_1$, this profit does not exceed
$M^H(q_1) + \delta M_m^H$; from the definition of $\tilde q_1$, the second
action is weakly dominated for $q_1 > \tilde q_1$). Thus, when $q_1$ belongs
to $[\tilde q_1, \tilde{\tilde q}_1]$, the entrant should believe that the
incumbent's cost is low, and should not enter. Thus, the low-cost incumbent
need not produce above $\tilde q_1$ to deter entry. We thus see that we are
left with a single separating PBE instead of a continuum (this reasoning is
due to Milgrom-Roberts).

A small caveat here: playing a quantity above $\tilde q_1$ is dominated for
the high-cost type only once the second period has been folded back. Before
that, one can think of (non-equilibrium) behavior which would not make such a
quantity a dominated strategy. For instance, following $q_m^H$, the entrant
might enter and charge a very low price. So, we are invoking a bit more than
the elimination of dominated strategies. A quantity above $\tilde q_1$ is
dominated conditional on subsequent equilibrium behavior, a requirement in
the spirit of perfectness. More generally, one will want to iterate the
elimination of weakly dominated strategies.

Note that, in the limit pricing game, the elimination of weakly dominated
strategies leaves us with the "least cost" separating equilibrium, but
does not help us select among the pooling equilibria. This is because the
equilibrium pooling quantities are not dominated for the high-cost type.

ii)  Elimination of Equilibrium Weakly Dominated Strategies (Intuitive
Criterion)

The next criterion was proposed by Kreps [1984] to single out a property
satisfied by the more stringent stability requirement of Kohlberg-Mertens
[1986] and, thus, to simplify its use in applications of game theory. The
idea is roughly to extend the elimination of weakly dominated strategies to
strategies which are dominated relative to equilibrium payoffs. So doing
eliminates more strategies and thus refines the equilibrium concept further.

More precisely, consider the signaling game and a corresponding PBE and
associated payoffs. Let $a_1$ denote an out-of-equilibrium action which
yields for a subset J of types payoffs lower than their equilibrium payoffs
whatever beliefs player 2 forms after observing $a_1$.

More formally, let $\Pi_1^*(t)$ denote player 1's equilibrium payoff when he
has type t. Let

    $BR(\tilde p_1, a_1) = \arg\max_{a_2 \in A_2} \{ \sum_t \tilde p_1(t)\, \Pi_2(a_1, a_2, t) \}$

denote player 2's best response(s) when he has posterior beliefs
$\tilde p_1(\cdot)$; and let

    $BR(I, a_1) = \bigcup_{\{\tilde p_1 :\, \tilde p_1(I) = 1\}} BR(\tilde p_1, a_1)$

denote the set of player 2's best responses when his posterior beliefs put
all the weight on a subset I of types.

Suppose that there exists a subset J of T such that:

(1)  For all t in J and for all $a_2$ in $BR(T, a_1)$,
     $\Pi_1(a_1, a_2, t) < \Pi_1^*(t)$.

(2)  There exists a type t in T-J such that for all $a_2$ in $BR(T-J, a_1)$,
     $\Pi_1(a_1, a_2, t) > \Pi_1^*(t)$.

From condition (1), we know that no type in J would want to deviate from
his equilibrium path, whatever inference player 2 would make following the
deviation. It thus seems logical that player 2 does not put any weight on
types in J. But, one could object, it may be that no type outside J gains
from the deviation either. This is why condition (2) is imposed: there exists
some type outside J that strictly gains from the deviation. The intuitive
criterion rejects PBE that satisfy (1) and (2) for some action $a_1$ and some
subset J.
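
Conditions (1) and (2) can be checked mechanically in a small finite
signaling game. The following is a minimal sketch, assuming two types and
pure best responses; the function names and the "beer-quiche" style payoffs
are hypothetical choices made for illustration and do not come from the text.
The helper `br_union` computes BR(I, a_1) exactly only for supports of one or
two types.

```python
from fractions import Fraction
from itertools import combinations

def br_union(u2, a1, support, actions2):
    """Pure best responses of player 2 to a1 for some belief on `support`."""
    support = list(support)
    if len(support) == 1:
        t = support[0]
        best = max(u2[a1, a2, t] for a2 in actions2)
        return {a2 for a2 in actions2 if u2[a1, a2, t] == best}
    if len(support) != 2:
        raise ValueError("this sketch handles at most two types in the support")
    t0, t1 = support
    found = set()
    for a in actions2:
        lo, hi = Fraction(0), Fraction(1)  # beliefs p = Prob(t1) where a is optimal
        for b in actions2:
            c0 = Fraction(u2[a1, a, t0] - u2[a1, b, t0])
            c1 = Fraction(u2[a1, a, t1] - u2[a1, b, t1])
            slope = c1 - c0  # a weakly beats b iff c0 + slope * p >= 0
            if slope > 0:
                lo = max(lo, -c0 / slope)
            elif slope < 0:
                hi = min(hi, -c0 / slope)
            elif c0 < 0:
                lo, hi = Fraction(1), Fraction(0)  # a never weakly beats b
        if lo <= hi:
            found.add(a)
    return found

def rejecting_subset(u1, u2, eq_payoff, a1, types, actions2):
    """Return a subset J satisfying conditions (1) and (2) for a1, else None."""
    br_all = br_union(u2, a1, types, actions2)
    for size in range(1, len(types)):
        for J in combinations(types, size):
            rest = [t for t in types if t not in J]
            cond1 = all(u1[a1, a2, t] < eq_payoff[t] for t in J for a2 in br_all)
            br_rest = br_union(u2, a1, rest, actions2)
            cond2 = any(all(u1[a1, a2, t] > eq_payoff[t] for a2 in br_rest)
                        for t in rest)
            if cond1 and cond2:
                return set(J)
    return None

# Hypothetical example: the sender gets 1 for his favorite breakfast and 2 if
# not challenged; the receiver gets 1 for dueling the weak type or for not
# dueling the surly type.
TYPES = ["weak", "surly"]
RESPONSES = ["duel", "not"]
FAVORITE = {"weak": "quiche", "surly": "beer"}
U1 = {(a1, a2, t): (1 if a1 == FAVORITE[t] else 0) + (2 if a2 == "not" else 0)
      for a1 in ("beer", "quiche") for a2 in RESPONSES for t in TYPES}
U2 = {(a1, a2, t): 1 if (a2 == "duel") == (t == "weak") else 0
      for a1 in ("beer", "quiche") for a2 in RESPONSES for t in TYPES}

POOL_ON_QUICHE = {"weak": 3, "surly": 2}  # equilibrium payoffs, no duel on path
POOL_ON_BEER = {"weak": 2, "surly": 3}
```

In this assumed example, pooling on quiche is rejected through the deviation
to beer with J = {weak}, whereas pooling on beer survives the deviation to
quiche.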

One immediately sees that this criterion has most power when there are
only two potential types (see below for an application to the limit pricing
game). The subsets J and T-J of the criterion are then necessarily composed
of one type each. Thus, the requirement "for all $a_2$ in $BR(T-J, a_1)$..."
in condition (2) is not too stringent, and the criterion has much cutting
power. With more than two types, however, there may exist many $a_2$ in
$BR(T-J, a_1)$ and, therefore, the requirement that some type prefers the
deviation for all $a_2$ in $BR(T-J, a_1)$ becomes very strong. The refinement
then loses some of its power.

Cho [1985] and Cho-Kreps [1987] invert the quantifiers in condition (2),
which becomes:

(2')  For every action $a_2$ in $BR(T-J, a_1)$, there exists t such that
      $\Pi_1(a_1, a_2, t) > \Pi_1^*(t)$.

In other words, whatever beliefs are formed by player 2 which do not put
weight on J, there exists some type (in T-J) who would like to deviate.
Condition (2') is somewhat more appealing than condition (2): if (2') is
satisfied, the players cannot think of any continuation equilibrium which
would satisfy (1) and deter any deviation. By contrast, condition (2),
except in the two-type case, allows continuation equilibria that satisfy (1)
and such that no player in (T-J) wants to deviate from equilibrium behavior.
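
The difference between (2) and (2') is only the order of the quantifiers,
which a few lines of code make concrete. In the sketch below, `gain[t, a2]`
stands for $\Pi_1(a_1, a_2, t) - \Pi_1^*(t)$; the function names and the
numbers are hypothetical and serve only to show a case with two types in T-J
where (2') holds but (2) fails.

```python
def condition_2(gain, types, responses):
    """(2): some single type strictly gains for every best response."""
    return any(all(gain[t, a2] > 0 for a2 in responses) for t in types)

def condition_2_prime(gain, types, responses):
    """(2'): for every best response, some type strictly gains."""
    return all(any(gain[t, a2] > 0 for t in types) for a2 in responses)

# Hypothetical gains from the deviation for the types in T-J.
TYPES_OUTSIDE_J = ["t2", "t3"]
BEST_RESPONSES = ["x", "y"]
GAIN = {("t2", "x"): 1, ("t2", "y"): -1,
        ("t3", "x"): -1, ("t3", "y"): 1}
```

Here each response leaves one type better off, so (2') is satisfied, but no
single type gains under both responses, so (2) is not. Since (2) implies
(2'), the inverted condition rejects at least as many equilibria.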

Cho and Cho-Kreps's "communicational equilibrium" is a PBE such that
there does not exist an off-the-equilibrium-path action $a_1$ and a subset of
types J that satisfy (1) and (2'). Banks and Sobel [1985] identify a
condition that is equivalent to (2'); they require (among other things) that
player 2's off-the-equilibrium-path beliefs place positive probability only
on player 1's types who might not lose from a defection. They go on to define
the concept of "divine equilibrium." A divine equilibrium thus satisfies the
Cho-Kreps criterion and, for finite games, exists (because it is stable).

We should also mention the work by Farrell [1984] and Grossman and Perry
[1986a,b], who offer a criterion similar to, but stronger than, the intuitive
criterion. In a signaling game their criterion roughly says that the initial
equilibrium is not acceptable if there exists a deviation $a_1$ and a set of
types J such that, when the posterior beliefs are the same as the prior
truncated to (T-J), types in J (respectively, in (T-J)) lose (respectively,
gain) relative to their equilibrium payoffs. This requirement is stronger
than the Cho-Kreps criterion because, in particular, it does not allow any
leeway in specifying posterior beliefs within the support (T-J). The
refinement, however, is so strong that equilibrium may not exist; so it is
restricted to a given (and as yet unknown) class of games.

Let us now apply the intuitive criterion to the limit pricing game. As
the intuitive criterion is stronger than iterated elimination of weakly
dominated strategies, we get at most one separating equilibrium. The reader
will check that this least-cost separating equilibrium indeed satisfies the
intuitive criterion. Let us next discuss the pooling equilibria (when they
exist, i.e., when pooling deters entry). Let us show that pooling at
$q_1 < q_m^L$ does not satisfy the intuitive criterion: Consider the
deviation to $q_m^L$. This deviation is dominated for the high-cost type
("J = H"), who makes a lower first-period profit and cannot increase his
second-period profit. Thus, posterior beliefs after $q_m^L$ should put zero
probability on the high-cost type, and entry is deterred. But then the
low-cost type would want to produce $q_m^L$. This reasoning, however, does
not apply to pooling equilibria with $q_1 \ge q_m^L$. Deviations to produce
less are not dominated for any type. Thus, one gets a (smaller) continuum of
pooling equilibria (the intuitive criterion here has less cutting power than
in the Spence signaling game; see Kreps [1984]).

One can restrict the set of pooling equilibria that satisfy the intuitive
criterion by invoking Pareto dominance: The pooling equilibrium at $q_m^L$
Pareto dominates pooling equilibria with $q_1 > q_m^L$ (both types of player
1 are closer to their static optimum, and player 2 does not care). But we are
still left with a separating and a pooling equilibrium, which cannot be
ranked using Pareto dominance (player 2 prefers the separating equilibrium).
For further refinements in the context of limit pricing, see Cho [1986].

iii)  Guessing Which Equilibrium One Is In (McLennan [1985])

McLennan's idea is that a move is more likely if it can be explained by
a  confusion  over  which  PBE  is  played.  He  calls  an  action  "useless"  if  it  is 
not  part  of  some  PBE  path.  Posterior  beliefs  at  some  unreached  information 
set  must  assign  positive  probability  only  to  nodes  that  are  part  of  some  PBE, 
if  any  (i.e.,  to  actions  which  are  not  useless).  One  thus  obtains  a  smaller 
set  of  PBE,  and  one  can  operate  this  selection  recursively  until  one  is  left 
with  "justifiable  equilibria"  (which,  for  finite  games,  are  stable). 

iv)  Getting Rid of Out-of-Equilibrium Events

As we explained, the indeterminacy of beliefs for out-of-equilibrium
events is often a source of multiplicity. The previous criteria (as well as
the ones presented in the next section) try to figure out what posterior
beliefs are reasonable in such events. An alternative approach, which was
pioneered by Saloner [1981] and Matthews-Mirman [1983], consists in
perturbing the game slightly so that these zero-probability events do not
occur. The basic idea of this technique is to let the action chosen by an
informed player be (at least a bit) garbled before it is observed by his
opponents. For instance, one could imagine that a firm's capacity choice is
observed with an error or that a manufacturer's price is garbled at the
retail level. With noise introduced, all (or most) potentially received
signals are equilibrium ones and, thus, refinements are not needed. Although
the class of games to which this technique can be applied is limited (the
noise must represent some reasonable economic phenomenon), this way of
proceeding seems natural and is likely to select the "reasonable" equilibria
of the corresponding ungarbled game in the limit (as Saloner, for instance,
shows in the limit pricing game).

4D.  Finite  Games;  Existence  and  Refinements  in  Finite  Games 

We now informally discuss refinements that are defined only for finite
games. Some of these refinements (Selten, Myerson) rest on the idea of
taking the limit of equilibria with "totally mixed strategies." One basically
considers robustness of each PBE to slight perturbations of the following
form: each agent in the game tree is forced to play all his potential
actions with some (possibly small) probability, i.e., to "tremble." This way,
Bayes' rule applies everywhere (there is no off-the-equilibrium-path
outcome). To be a bit more formal, assume that an agent is forced to put
weight (probability) $\sigma(a)$ on action a, where
$\sigma(a) \ge \epsilon(a) > 0$ (for each action a). Then the agent can
maximize his payoff given these constraints and pick a best perturbed
strategy. A refined equilibrium is a PBE which is the limit of equilibria
with totally mixed strategies, where the limit is taken for a given class of
perturbations. The other two refinements we discuss (Kreps-Wilson,
Kohlberg-Mertens) employ somewhat similar ideas. We shall present the
refinements in increasing order of strength.
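
The "best perturbed strategy" has a simple form because expected payoff is
linear in the agent's own mixing probabilities: every action receives its
minimum weight $\epsilon(a)$ and all remaining probability goes to a
payoff-maximizing action. A minimal sketch follows, in which the function
name, the action labels, and the numbers are assumptions for illustration;
`payoffs` maps each action to its expected payoff against the opponents'
(perturbed) strategies.

```python
from fractions import Fraction

def best_perturbed_strategy(payoffs, eps):
    """Maximize expected payoff subject to sigma(a) >= eps(a) for every a."""
    slack = 1 - sum(eps.values())
    if slack < 0:
        raise ValueError("minimum weights exceed total probability 1")
    best_action = max(payoffs, key=payoffs.get)  # ties broken arbitrarily
    sigma = dict(eps)                            # start from the forced trembles
    sigma[best_action] += slack                  # rest goes to a best action
    return sigma

# Hypothetical example with two actions and trembles of 1/100 each.
EXAMPLE_PAYOFFS = {"U": 2, "D": 5}
EXAMPLE_TREMBLES = {"U": Fraction(1, 100), "D": Fraction(1, 100)}
EXAMPLE_SIGMA = best_perturbed_strategy(EXAMPLE_PAYOFFS, EXAMPLE_TREMBLES)
```

As the trembles shrink to zero, the resulting strategy converges to an
unconstrained best response, which is why limits of such equilibria remain
equilibria of the original game.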

General existence results for equilibria of dynamic games with incomplete
information have been provided only for games with a finite number of
actions and types, starting with Selten. We sketch the proof of existence of
a trembling-hand equilibrium below. Proofs of existence for alternative
refinements are similar.

i)  Sequential Equilibrium (Kreps-Wilson [1982])

Kreps-Wilson look at PBE which satisfy a consistency requirement. The
set of strategies and beliefs at each information set of the game must be the
limit of a sequence of sets of strategies and beliefs for which strategies
are always totally mixed (and beliefs are thus pinned down by Bayes' rule).
Moreover, the beliefs of all players are derived as the limit corresponding
to a common sequence of strategies. The strategies and beliefs are not a
priori required to form a PBE of a perturbed game. So, the check is purely
mechanical; given a PBE, it suffices to show that it is or is not the limit
of a sequence of totally mixed strategies and associated beliefs.

Let us now discuss the consistency requirement. In the simple signaling
game considered above, it has no bite, and a PBE is also sequential, as is
easily seen (by choosing the trembles in player 1's strategy adequately, one
can generate any beliefs one wants). Sequential equilibrium has more cutting
power in more complex games because it imposes consistent beliefs between the
players (or agents) off the equilibrium path. For instance, if there are two
receivers in the signaling game (players 2 and 3), these two players should
form the same beliefs as to player 1's type when observing the latter's
action. This property comes from the fact that at each stage of the
converging sequence, the Bayesian updating of players 2 and 3 uses the same
trembles by player 1 and thus reaches the same conclusion. Similarly,
sequential equilibrium requires consistency of a player's beliefs over time.
Kreps and Wilson have shown that for "almost all" games, the sequential
equilibrium concept coincides with the perfect equilibrium concept (see
below). For the other (nongeneric) games, it allows more equilibria. Selten
requires the strategies in the perturbed game to be optimal given the
perturbed strategies. But, unless the payoff structure exhibits ties, this
condition has no more bite than the consistency requirement of Kreps-Wilson.

ii)  Trembling-hand Perfect Equilibrium (Selten [1975])

In developing his notion of "trembling-hand" perfection, Selten begins by
working with the normal form. An equilibrium is "trembling-hand perfect in
the normal form" if it is the limit of equilibria of "$\epsilon$-perturbed"
games in which all strategies have at least an $\epsilon$ probability of
being played. That is, in an $\epsilon$-perturbed game, players are forced to
play action a with probability of at least $\epsilon(a)$, where the
$\epsilon(a)$ are arbitrary as long as they all exceed $\epsilon$. The
$\epsilon(a)$ are called "trembles." The idea of introducing trembles is to
give each node in the tree positive probability, so that the best responses
at each node are well-defined. The interpretation of the trembles is that in
the original game, if a player unexpectedly observes a deviation from the
equilibrium path, he attributes this to an inadvertent "mistake" by one of
his opponents.

To  see  how  the  trembles  help  refine  the  equilibrium  set,  let  us  once 
again  consider  the  game  in  Figure  5  which  Selten  used  to  motivate  subgame 
perfectness. 

The Nash equilibrium {U,R} is not the limit of equilibria with trembles:
if  player  1  plays  D  with  some  probability,  player  2  puts  as  much  weight  as 
possible  on  L. 

However, Selten notes that his refinement is not totally satisfactory.
Consider Figure 11, which is a slight variation on the previous game. Player
1 moves at "dates" 1 and 3.

PUT  FIGURE  11  HERE 

The only subgame perfect equilibrium is $\{L_1, L_2, L_1'\}$. But the
subgame-imperfect Nash equilibrium $\{R_1, R_2, R_1'\}$ is the limit of
equilibria with trembles. To see why, let player 1 play $(L_1, L_1')$ with
probability $\epsilon^2$ and $(L_1, R_1')$ with probability $\epsilon$. Then
player 2 should put as much weight as possible on $R_2$, because player 1's
probability of "playing" $R_1'$ conditional on having "played" $L_1$ is
$\epsilon / (\epsilon + \epsilon^2) \approx 1$ for $\epsilon$ small.
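
The conditional probability used in this argument is a direct application of
Bayes' rule to the two trembles. A short sketch (the function name is an
assumption made for illustration) shows how quickly it approaches 1:

```python
from fractions import Fraction

def prob_R1prime_given_L1(eps):
    """Bayes' rule: weight eps on (L1, R1') against eps**2 on (L1, L1')."""
    weight_L1_then_R1prime = eps
    weight_L1_then_L1prime = eps ** 2
    return weight_L1_then_R1prime / (weight_L1_then_R1prime + weight_L1_then_L1prime)

# The ratio simplifies to 1 / (1 + eps), which tends to 1 as eps goes to 0.
CONDITIONAL_AT_ONE_PERCENT = prob_R1prime_given_L1(Fraction(1, 100))
```

With a tremble of 1/100, the conditional probability is already 100/101.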

When perturbing the normal form, we are allowing for a type of correlation
between a player's trembles at different information sets. In the above
example, if a player "trembles" onto $L_1$, he is very likely to tremble
again. This correlation goes against the idea that players expect their
opponents to play optimally at any point in the game tree, including those
not on the equilibrium path.

To avoid this correlation, Selten introduces a second refinement, based
on the "agent's normal form." The idea is to treat the two choices of player
1 in Figure 11 as made by two different players, each of whom trembles
independently of the other. More precisely, the agent normal form for a given
game is constructed by distinguishing players not only by their names (i) and
their types ($t_i$), but also by their location in the game tree. So, for
instance, player 1 with type $t_1$ playing at date 1 is not the same agent as
player 1 with type $t_1$ playing at date 3; or player 1 with type $t_1$
playing at date 3 should be considered as a different agent depending on his
(her) information at that date. In the agent's normal form, each information
set represents a different agent/player. However, different agents of the
same player i with type $t_i$ are endowed with the same objective function. A
"trembling-hand perfect" equilibrium is a limit of equilibria of
$\epsilon$-perturbed versions of the agent's normal form.

It is clear that a trembling-hand perfect equilibrium is sequential: We
can construct consistent beliefs at each information set as the limit of the
beliefs computed by Bayes' rule in the perturbed games, and the equilibrium
strategies are sequentially rational given these beliefs. One might expect
that the (constrained) optimality requirement along the converging sequence
adds some cutting power. However, the arbitrariness of the $\epsilon(a)$
makes perfectness a weak refinement, as shown by Kreps-Wilson's result that
the sets of sequential and perfect equilibria coincide for generic
extensive-form payoffs.

Let us now sketch the proof of existence of a trembling-hand perfect
equilibrium. Remember that the proof of existence of a Bayesian equilibrium
consists of considering $\sum_i |T_i|$ players (i.e., in introducing one
player per type), and applying standard existence theorems for Nash
equilibrium. More generally, the proof for trembling-hand perfect equilibrium
uses existence of a Nash equilibrium on the agents' normal form. Consider the
perturbed game in which the agents are forced to play trembles (i.e., to put
weight at least equal to $\epsilon(a)$ on action a). The strategy spaces are
compact convex subsets of a Euclidean space. Payoff functions are continuous
in all variables and quasiconcave (actually, linear) in own strategy. So
there exists a Nash equilibrium of the agents' normal form of the perturbed
game. Now consider a sequence of equilibrium strategies when $\epsilon$ tends
to zero. Because the strategy spaces are compact, there is a converging
subsequence. The limit of such a subsequence is called a trembling-hand
perfect equilibrium.*

We should also note that Selten works with the normal form or the
agents' normal form; so do the next two refinements. Thus, beliefs are left
implicit. Kreps and Wilson's paper is the first pure game theory article to
put emphasis on the extensive form and on beliefs (although there is a
current debate about whether refinements should be defined on the normal or
the extensive form, the refinements that are currently easily applicable to
industrial organization models put constraints on beliefs; see the previous
section).

iii)  Proper  Equilibrium  (Myerson  [1978]) 

Myerson considers perturbed games in which, say, a player's second-best
action(s) get at most $\epsilon$ times the weight of the first-best
action(s), the third-best action(s) get at most $\epsilon$ times the weight
of the second-best action(s), etc. The idea is that a player is "more likely
to tremble" and put weight on an action which is not too detrimental to him;
the probability of deviations from equilibrium behavior is inversely related
to their costs. As the set of allowed trembles is smaller, a proper
equilibrium is also perfect. [With such an ordering of trembles, there is no
need to work on the agent's normal form. The normal form suffices.]


* Note that because payoffs are continuous, the limit is automatically a
  Nash equilibrium. But the converse, of course, is not true (for
  instance, for games of perfect information, a trembling-hand equilibrium
  is subgame perfect, as is easily seen).

To illustrate the notion of proper equilibrium, consider the following
game (due to Myerson):

PUT  FIGURE  12  HERE 

This game has three pure strategy Nash equilibria: (U,L), (M,M) and (D,R).
Only two of these are perfect equilibria: D and R are weakly dominated
strategies and therefore cannot be optimal when the other player trembles.
(M,M) is perfect: Suppose that each player plays M with probability
$1 - 2\epsilon$ and each of the other two strategies with probability
$\epsilon$. Deviating to U for player one (or to L for player two) changes
this player's payoff by $(\epsilon - 9\epsilon) - (-7\epsilon) = -\epsilon < 0$.
However, (M,M) is not a proper equilibrium. Each player should put much more
weight (tremble more) on his first strategy than on his third, which yields a
lower payoff. But if player one, say, puts weight $\epsilon$ on U and
$\epsilon^2$ on D, player two does better by playing L than by playing M, as
$(\epsilon - 9\epsilon^2) - (-7\epsilon^2) > 0$ for $\epsilon$ small. The
only proper equilibrium in this game is (U,L).
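
The two payoff comparisons above can be reproduced with a few lines of code.
Figure 12 is not reproduced in this text, so the payoff table below is an
assumption: it is a symmetric common-payoff table chosen to be consistent
with the expressions in the paragraph (1 at (U,L), -9 against the third
strategy from the first, -7 elsewhere in the third row and column, 0
otherwise). The function names are likewise hypothetical.

```python
from fractions import Fraction

# Assumed payoff table; both players receive the same payoff in each cell.
PAYOFF = {
    ("U", "L"): 1,  ("U", "M"): 0,  ("U", "R"): -9,
    ("M", "L"): 0,  ("M", "M"): 0,  ("M", "R"): -7,
    ("D", "L"): -9, ("D", "M"): -7, ("D", "R"): -7,
}

def column_payoff(col, row_mix):
    """Player two's expected payoff from `col` against player one's mix."""
    return sum(prob * PAYOFF[row, col] for row, prob in row_mix.items())

def gain_L_over_M(row_mix):
    """Player two's gain from switching from M to L."""
    return column_payoff("L", row_mix) - column_payoff("M", row_mix)

EPS = Fraction(1, 100)
# Equal trembles, as in the perfectness check for (M, M).
EQUAL_TREMBLES = {"U": EPS, "M": 1 - 2 * EPS, "D": EPS}
# Ordered trembles, as properness requires: D is costlier than U.
ORDERED_TREMBLES = {"U": EPS, "M": 1 - EPS - EPS ** 2, "D": EPS ** 2}
```

Under equal trembles the gain from L is $-\epsilon$, so M remains optimal;
under ordered trembles it is $\epsilon - 2\epsilon^2$, which is positive for
small $\epsilon$, so (M,M) cannot be the limit of such perturbed equilibria.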

iv)   Stable  Equilibrium  (Kohlberg-Mertens  [1986]) 

Ideally, one would wish a PBE to be the limit of some perturbed
equilibrium for all perturbations when the size of these perturbations goes
to zero. Such an equilibrium, if it exists, is labelled "truly perfect."
Unfortunately, true perfection may be out of this world (truly perfect
equilibria tend not to exist). Kohlberg and Mertens, to obtain existence,
settled for "stability." Stability is a complex criterion, which encompasses
the intuitive criterion mentioned in the previous section and other features
as well. Let us give an example of the description of a stable equilibrium in
the signaling game (this introduction follows Kreps [1984]). Consider two
totally mixed strategies $\tilde\sigma_1$ and $\tilde\sigma_2$ for players 1
and 2, and two strictly positive numbers $\epsilon_1$ and $\epsilon_2$. A
$\{\epsilon_i, \tilde\sigma_i\}_{i=1,2}$ perturbation of the original game is
such that, when player i chooses strategy $\sigma_i$, the strategy which is
implemented for him is $\sigma_i$ with probability $(1 - \epsilon_i)$ and
$\tilde\sigma_i$ with probability $\epsilon_i$. Let $(\sigma_1, \sigma_2)$ be
a PBE of the perturbed game. A subset E of PBE of the original game is stable
if, for any $\eta > 0$, every such perturbed game with sufficiently small
$\epsilon_1$ and $\epsilon_2$ has an equilibrium that lies no more than
$\eta$ from the set E. A stable component is then defined as a minimal
connected stable set of equilibria. Kohlberg and Mertens have shown that
every game has at least one stable component, and that, for almost every
signaling game, all equilibria within a given connected component give rise
to the same probability distribution on endpoints.


4E.  Perturbed  Games  and  Robust  Equilibria 

Our earlier discussion of the Saloner/Matthews-Mirman contribution
emphasized the robustness of the solution to the introduction of noise. More
generally, robustness to "reasonable" structural changes in the game seems
desirable. This leads us to the discussion of the reputation-effects model
of Kreps-Wilson-Milgrom-Roberts [1982], which is one of the most important
applications of the theory of dynamic games of incomplete information.

This work actually started with a robustness issue: In the finite-horizon
repeated prisoners' dilemma the only equilibrium is "fink, fink" at each
period. As we observed in Section 2, this conclusion seems extreme for long,
finite games; in response, the four authors decided to perturb the prisoners'
dilemma game slightly by introducing a small probability that each party is
willing to play the suboptimal strategy tit-for-tat. Similarly, in the
context of example 1, one could introduce a probability that firm 1 enjoys
preying (is crazy). Then, if the horizon is sufficiently long and the
discount rate sufficiently small, it may be worthwhile for a sane type (one
whose payoff is as originally specified) to pretend at the start that it is a
crazy type. By cooperating in the repeated prisoners' dilemma game or preying
in the predation game, the sane type invests in a reputation that will induce
the other player to take actions that are favorable to the former (cooperate;
stay out). Thus, in games that are repeated for a long time, a small
difference in information can make a big difference in terms of outcome.

Fudenberg-Maskin [1986] develop the reputation-effects model to its
logical conclusion. They show that, for any e, when the horizon goes to
infinity, all individually rational payoffs of a finitely repeated,
full-information game can arise as PBE of a slightly perturbed, incomplete
information game, in which the objective function of each player is that of
the original game with probability (1-e) and can be any "crazy" objective
function with probability e. In the Friedman tradition, the result that one
can obtain any payoff Pareto superior to a Nash payoff is easy to derive:
Consider a Nash equilibrium of the original game ("fink, fink" in the repeated
prisoners' dilemma), an allocation that dominates this Nash equilibrium,
and the corresponding prescribed strategies ("cooperate, cooperate").
Suppose that with probability e, each player has the following objective
function: "I like to play the strategy corresponding to the superior allocation
as long as the others have followed their corresponding strategies; if
somebody has deviated in the past, my taste commands me to play my Nash
equilibrium strategy forever." Now suppose that the horizon is long. Then by
cooperating, each player loses at most one period's payoff if the other
player deviates. By deviating, he automatically forgoes the gain from
cooperating with the crazy type until the end. So, as long as enough time
remains until the end of the horizon ("enough" depends on e), deviating is
not optimal. The proof for points that do not dominate a
Nash  equilibrium  is  harder. 
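
The counting argument above can be illustrated numerically. The following
Python sketch is only a back-of-the-envelope version, not the
Fudenberg-Maskin proof. It assumes no discounting, hypothetical stage
payoffs c (mutual cooperation), n (the "fink, fink" Nash payoff), d (finking
against a cooperator) and s (cooperating against a finker), and it assumes
that play reverts to the Nash payoff forever after any deviation. Under
those assumptions it searches for the smallest number of remaining periods
at which a crude lower bound on the payoff from cooperating is at least the
payoff from deviating.

```python
def cooperate_lower_bound(k, e, c, n, s):
    """Crude lower bound on the payoff from cooperating with k periods left.

    With probability e the opponent is the "crazy" type and cooperation
    lasts all k periods; otherwise assume the worst case, in which the
    opponent finks at once (payoff s now, Nash payoff n afterwards).
    """
    return e * k * c + (1 - e) * (s + (k - 1) * n)


def deviate_upper_bound(k, n, d):
    """Payoff from finking now: d once, then the Nash payoff n in every
    later period (reversion after a deviation is an assumption here)."""
    return d + (k - 1) * n


def min_periods_to_cooperate(e, c, n, d, s, max_k=1_000_000):
    """Smallest k for which cooperating is at least as good as deviating,
    or None if no such k <= max_k exists."""
    if not 0 < e <= 1:
        raise ValueError("e must lie in (0, 1]")
    if not c > n:
        raise ValueError("cooperation must beat the Nash payoff (c > n)")
    for k in range(1, max_k + 1):
        if cooperate_lower_bound(k, e, c, n, s) >= deviate_upper_bound(k, n, d):
            return k
    return None


if __name__ == "__main__":
    # Hypothetical prisoners' dilemma payoffs with d > c > n > s.
    for e in (0.5, 0.1, 0.01):
        print(e, min_periods_to_cooperate(e, c=2, n=1, d=3, s=0))
```

In this sketch a smaller e pushes the threshold up, which is the sense in
which "enough" time depends on e.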

The reputation effects papers show that adding a small e of incomplete
information to a long but finitely repeated game could make virtually
anything into a PBE. However, for any fixed horizon, a sufficiently small e of
the form they considered has no effect. If we admit the possibility that
players have private information about their opponents' payoffs, then even in
a fixed extensive form, the sequential rationality requirements of PBE
completely lose their force. More precisely, any Nash equilibrium of an
extensive form is a PBE (indeed, a stable PBE) of a perturbed game in which
payoffs differ from the original ones with vanishingly small probability.

PUT  FIGURE  13  HERE 

Consider the game in Figure 13. Player 1 has two possible types, t_1 and
t_2, with Prob(t = t_1) = 1-e. When t = t_1, the game is just as in the game of
Figure 5, where the backwards induction equilibrium was (D,L). When t = t_2,
though, player 2 prefers R to L. The strategies (U_1, D_2, R) are a PBE for this
game; if player 2 sees D, he infers that t = t_2. Thus, a "small"
perturbation of the game causes a large change in play: player 1 chooses U
with probability (1-e). Moreover, this equilibrium satisfies all the currently
known refinements.
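
The equilibrium check can be made concrete with a small Python sketch.
Because Figure 13 is not reproduced here, the payoff numbers below are
hypothetical, chosen only to match the verbal description (after D, player 2
prefers L when facing the first type and R when facing the second type); the
function names are likewise illustrative. The code applies Bayes' rule to
player 2's beliefs after D and then verifies that neither type of player 1,
nor player 2, gains by switching.

```python
# Hypothetical payoffs (player 1, player 2); not taken from Figure 13.
# "U" ends the game; "DL" and "DR" are D followed by player 2's L or R.
PAYOFFS = {
    "t1": {"U": (2, 2), "DL": (3, 1), "DR": (0, 0)},
    "t2": {"U": (2, 2), "DL": (3, 0), "DR": (3, 1)},
}


def posterior_after_D(e, strategy1):
    """Bayes' rule: player 2's beliefs over types after observing D.

    strategy1 maps each type to "U" or "D". Returns None if D has
    probability zero under strategy1 (beliefs are then unrestricted).
    """
    prior = {"t1": 1 - e, "t2": e}
    weight = {t: prior[t] * (strategy1[t] == "D") for t in prior}
    total = sum(weight.values())
    if total == 0:
        return None
    return {t: w / total for t, w in weight.items()}


def expected_payoff_2(action2, beliefs):
    """Player 2's expected payoff from L or R after D, given beliefs."""
    return sum(beliefs[t] * PAYOFFS[t]["D" + action2][1] for t in beliefs)


def payoff_1(t, action1, action2):
    """Type t's payoff from action1 when player 2 answers D with action2."""
    outcome = "U" if action1 == "U" else "D" + action2
    return PAYOFFS[t][outcome][0]


def is_pbe(e, strategy1, action2):
    """Check Bayesian updating and sequential rationality for a pure
    profile in which D is played with positive probability."""
    beliefs = posterior_after_D(e, strategy1)
    if beliefs is None:
        raise ValueError("D is off the path; beliefs must be supplied")
    other2 = "L" if action2 == "R" else "R"
    if expected_payoff_2(action2, beliefs) < expected_payoff_2(other2, beliefs):
        return False
    for t, action1 in strategy1.items():
        other1 = "D" if action1 == "U" else "U"
        if payoff_1(t, action1, action2) < payoff_1(t, other1, action2):
            return False
    return True


if __name__ == "__main__":
    profile = {"t1": "U", "t2": "D"}
    print(posterior_after_D(0.01, profile), is_pbe(0.01, profile, "R"))
```

With these illustrative numbers, the profile in which only the second type
plays D and player 2 answers with R passes the check for any e in (0, 1),
since observing D then reveals the second type with certainty.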

Most  of  these  refinements  proceed  by  asking  what  sort  of  beliefs  are 
"reasonable"  —  what  should  players  expect  following  unexpected  events?  If 
they  have  very  small  doubts  about  the  structure  of  the  game,  the  unexpected 
may signal, as here, that things are indeed other than had previously seemed
likely.  This  point  is  developed  in  Fudenberg-Kreps-Levine  [198?]. 

Thus,  small  changes  in  information  structure  can  always  extend  the  set 
of  predictions  to  include  all  of  the  Nash  equilibria,  and  in  long  repeated 
games the "robustness" problem is even more severe. What then is the
predictive content of game theory? In real world situations, it may be the
case that only some types are likely (most types of "craziness" are not
plausible). The players may then have a fairly good idea of what game is
played. However, the economist, who is an outsider, may have a hard time
knowing which information structure is the relevant one. Thus, one may think
of a situation in which the players are following the
Kreps-Wilson-Milgrom-Roberts strategies and yet the reputation literature is
of little help to the economist. If this is true, the economist should collect
information about the way real world players play their games and which
information structure they believe they face, and then try to explain why
particular sorts of "craziness" prevail.

The  above  implies  a  fairly  pessimistic  view  of  the  likelihood  that  game 
theory  can  hope  to  provide  a  purely  formal  way  of  choosing  between  PBEs.   It 
would  be  rash  for  us  to  assert  this  position  too  strongly,  for  research  on 
equilibrium  refinements  is  proceeding  quite  rapidly,  and  our  discussion  here 
may  well  be  outdated  by  the  time  it  appears  in  print.   However,  at  present  we 
would  not  want  to  base  important  predictions  solely  on  formal  grounds.   In 
evaluating antitrust policy, for example, practitioners will need to combine
a knowledge of the technical niceties with a sound understanding of the
workings of actual markets.

Concluding Remark

Our  already  incomplete  discussion  of  equilibrium  concepts  for  dynamic 
games of incomplete information is likely to be out of date very shortly, as
the pace of activity in this field is very intense, and current refinements
have not yet been tested for a wide class of models. Our purpose was only to
provide an introduction, a survey, and some cookbook recipes for readers who
currently  want  to  apply  these  techniques  to  specific  games. 